Figure 3
[Construct diagram; labels recovered from the drawing: 6 kbps; pKS/Lama 2; 3a (PCR 1): Lama-up, ApaI, Lama-Cla; 3b (PCR 2): Lama-up, ApaI, Lama-signal; PSP; MGS; Appa-Cla; Appa-Kpn; R15/APPA; Appa-mature; NotI; Amp; Lama2/PSP/APPA, 20 kbps; Lama2/APPA, 20 kbps]

Figure 3B
Figure 5. The nucleic acid sequence of the Lama2/APPA plasmid (SEQ ID NO: 1)

LOCUS       Lama-appA-    20623 bp    DNA    CIRCULAR    SYN    17-JAN-2000
DEFINITION  Lama 2/APPA transgenic construct
ACCESSION   Lama 2-appA,
KEYWORDS    parotid secretory protein; acid glucose-1-phosphatase; appA
            gene; periplasmic phosphoanhydride phosphohydrolase; artificial
            sequence; cloning vector
REFERENCE   1 (bases 1 to 20623)
  AUTHORS   Golovan, S., Forsberg, C.W., Phillips, J.
  JOURNAL   Unpublished.
FEATURES

DEFINITION  M. musculus Psp gene for parotid secretory protein.
ACCESSION   X68699
VERSION     X68699.1  GI:53809
SOURCE      house mouse.
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia;
            Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1 (bases 3777 to 5332)
  AUTHORS   Svendsen,P., Laursen,J., Krogh-Pedersen,H. and Hjorth,J.P.
  TITLE     Novel salivary gland specific binding elements located in the PSP
            proximal enhancer core
  JOURNAL   Nucleic Acids Res. 26 (11), 2761-2770 (1998)
  MEDLINE   98256451
REFERENCE   2 (bases 7147 to 12653; 13952 to 17731)
  AUTHORS   Mikkelsen,T.R.
  TITLE     Direct Submission
  JOURNAL   Submitted (07-OCT-1992) T.R. Mikkelsen, Department of Molecular
            Biology, University of Aarhus, CF Mollers Alle 130, 8000
            Aarhus, DENMARK
REFERENCE   3 (bases 7147 to 12653; 13952 to 17731)
  AUTHORS   Laursen J, Hjorth JP
  TITLE     A cassette for high-level expression in the mouse salivary glands.
  JOURNAL   Gene 1997 Oct 1; 198(1-2): 367-72
  MEDLINE   9370303
FEATURES             Location/Qualifiers
     source          1 to 12653; 13952 to 17731
                     /organism="Mus musculus"
                     /strain="C3H/As"
                     /db_xref="taxon:10090"
                     /chromosome="2"
                     /map="Estimate; 69 cM from centromere"
                     /clone="Lambda YP1, Lambda YP3, Lambda YP7"
                     /clone_lib="Lambda-PHAGE (Lambda L47.1)"
                     /germline
                     /note="Allele: b"
     misc_feature    3777-5332
                     /gene="PSP"
                     /function="salivary gland specific positive acting
                     regulatory region"
     enhancer        7147..8724
                     /evidence=experimental
     exon            11778..11824
                     /gene="Psp"
                     /note="exon a"
                     /number=1
                     /evidence=experimental
     exon            12626..14190
                     /gene="Psp"
                     /note="exon b fused with exons h and i"

Figure 5A:

     misc_feature    12644-12652
                     /function="consensus sequence for initiation in higher
                     eukaryotes"
     misc_feature    13952-13965
                     /function="M13mp18 polylinker"

DEFINITION  E. coli periplasmic phosphoanhydride phosphohydrolase (appA) gene,
ACCESSION   M58708 L03370 L03371 L03372 L03373 L03374 L03375
VERSION     M58708.1  GI:145283
SOURCE      Escherichia coli DNA.
  ORGANISM  Escherichia coli
            Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae;
            Escherichia.
REFERENCE   1 (bases 12653..13951)
  AUTHORS   Dassa,J., Marck,C. and Boquet,P.L.
  TITLE     The complete nucleotide sequence of the Escherichia coli gene appA
            reveals significant homology between pH 2.5 acid phosphatase
            and glucose-1-phosphatase
  JOURNAL   J. Bacteriol. 172 (9), 5497-5500 (1990)
  MEDLINE   90368616
FEATURES             Location/Qualifiers
     source          12653..13951
                     /organism="Escherichia coli"
                     /db_xref="taxon:562"
     sig_peptide     12653..12718
                     /gene="appA"
     CDS             12653..13951
                     /gene="appA"
                     /standard_name="acid phosphatase/phytase"
                     /transl_table=11
                     /product="periplasmic phosphoanhydride phosphohydrolase"
                     /protein_id="AAA72086.1"
                     /db_xref="GI:145285"
     mat_peptide     12719..13948
                     /gene="appA"
                     /product="periplasmic phosphoanhydride phosphohydrolase"
     mutation        replace(12659..12661, "gcg changed to gcc")
                     /gene="appA"
                     /standard_name="A3 mutant"
                     /note="created by site directed mutagenesis"
                     /citation=[3]
                     /phenotype="silent mutation"
     mutation        replace(13934..13936, "ccg changed to ccc")
                     /gene="appA"
                     /standard_name="P428 mutant"
                     /note="created by site directed mutagenesis"
                     /citation=[3]
                     /phenotype="silent mutation"

Figure 5B:

     mutation        replace(13937..13939, "gcg changed to gct")
                     /gene="appA"
                     /standard_name="A429 mutant"
                     /note="created by site directed mutagenesis"
                     /citation=[3]
                     /phenotype="silent mutation"
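
The three mutation features above can be cross-checked against the CDS coordinates. The following is a minimal sketch, not part of the original listing: it assumes the CDS starts at base 12653 as annotated, takes the third replacement codon to be gct, and uses only the five codon assignments from the standard genetic code that these features need. The names `CODON_TO_AA`, `codon_number` and `check_mutations` are illustrative.

```python
# Codon assignments from the standard genetic code; only the codons named
# in the three mutation features are listed here.
CODON_TO_AA = {"gcg": "A", "gcc": "A", "gct": "A", "ccg": "P", "ccc": "P"}

CDS_START = 12653  # first base of the appA CDS, as given in the feature table

# (first base, last base, original codon, replacement codon, annotated name)
MUTATIONS = [
    (12659, 12661, "gcg", "gcc", "A3"),
    (13934, 13936, "ccg", "ccc", "P428"),
    (13937, 13939, "gcg", "gct", "A429"),  # third codon assumed to be gct
]


def codon_number(first_base, cds_start=CDS_START):
    """Return the 1-based codon index of a codon starting at first_base."""
    offset = first_base - cds_start
    if offset % 3 != 0:
        raise ValueError("position is not on a codon boundary")
    return offset // 3 + 1


def check_mutations(mutations):
    """For each mutation, report (annotated name, residue+codon index, is_silent)."""
    report = []
    for first, _last, old, new, name in mutations:
        residue = CODON_TO_AA[old]
        silent = residue == CODON_TO_AA[new]
        report.append((name, f"{residue}{codon_number(first)}", silent))
    return report


if __name__ == "__main__":
    for row in check_mutations(MUTATIONS):
        print(row)
```

Under these assumptions each computed residue label agrees with the annotated mutant name, and each codon change leaves the encoded amino acid unchanged, which is what the "silent mutation" phenotype states.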



DEFINITION  pBluescript II KS(+) vector DNA,
ACCESSION   X52327
VERSION     X52327.1  GI:58061
KEYWORDS    artificial sequence; cloning vector; expression vector; vector.
SOURCE      synthetic construct,
  ORGANISM  synthetic construct
            artificial sequence.
REFERENCE   1 (bases 17732 to 20623)
  AUTHORS   Thomas,E.A.
  TITLE     Direct Submission
  JOURNAL   Submitted (20-FEB-1990) Thomas E.A., Stratagene Cloning
            Systems, 11099 North Torrey Pines Rd., La Jolla, CA 92037,
            USA
REFERENCE   2 (bases 17732 to 20623)
  AUTHORS   Short,J.M., Fernandez,J.M., Sorge,J.A. and Huse,W.D.
  TITLE     Lambda ZAP: a bacteriophage lambda expression vector with in
            vivo excision properties
  JOURNAL   Nucleic Acids Res. 16 (15), 7583-7600 (1988)
  MEDLINE   88319944
REFERENCE   3 (bases 17732 to 20623)
  AUTHORS   Alting-Mees,M.A. and Short,J.M.
  TITLE     pBluescript II: gene mapping vectors
  JOURNAL   Nucleic Acids Res. 17 (22), 9494 (1989)
  MEDLINE   90067967
FEATURES             Location/Qualifiers
     source          17732 to 20623
                     /organism="synthetic construct"
                     /db_xref="taxon:32630"
     CDS             complement(18967..19827)
                     /gene="Amp"
                     /product="b-lactamase"
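
In GenBank feature tables, a `complement(...)` location means the feature is read from the strand opposite to the one listed, in reverse order. The sketch below is illustrative only and is not part of the original listing: the helper names `reverse_complement` and `extract_feature` are hypothetical, and the toy input in the usage note is not taken from the plasmid sequence. The coordinates are those of the Amp CDS above.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_feature(seq: str, start: int, end: int, complement: bool = False) -> str:
    """Extract a feature given 1-based, inclusive coordinates.

    When complement is True, the slice is reverse-complemented so the
    result reads 5' to 3' on the coding strand.
    """
    sub = seq[start - 1:end]
    return reverse_complement(sub) if complement else sub


# Coordinates of the Amp CDS as annotated: complement(18967..19827).
AMP_START, AMP_END = 18967, 19827
AMP_LENGTH = AMP_END - AMP_START + 1  # number of bases spanned by the feature
```

For example, `extract_feature("ATGCCC", 1, 3, complement=True)` returns `"CAT"`. With the annotated coordinates, `AMP_LENGTH` is a multiple of three, as expected for a coding region that includes its stop codon.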



BASE COUNT     5449 a   4847 c   4902 g   5424 t
ORIGIN
1 TCGAGAGTAT CTTTGTCAGC TGTGCCTCCA ACAAAGGGGT ACTGTTGCCC ACATAGAAAG
61 ATCT7\AACTA ATTAATTAAT CCCTCACCCG CAAATCTTTC AGTCACTAAG TTAGCACGAT
121 TGTTGAACT^ GTTCTCCAAA GGAGAGATAC AGATGAGTGC GTATAGGGTG GACCTGGCTG
181 CTGAGGAGAC ACCTGCATCT GACTAAGAAG AGCCACGGTG TTAGTTGAAT GGTGTGGAGT
241 AGGGTGGTTC TGTGGGACAG TAGAAAATCG AGAGGCATGT GCCGTTTAGT GAACTGATGG
301 AAGCTACCCC AAACGACAGA GATTGTCAGT CAGGCCAATC CGTTTCGAGT TTGATGGGCA
361 GCCGGACAGT GAGACAGACA CACCTACTCA GTTGGAGGAA GGATGAGAAC AATGGCCAGC
421 AGGGATTGAG AGACCCTGAC AGGCGCAAGG CCCTAACACA CACACCTACC ACCTCACTTG
481 ACAAAGCTGC CAAAGACCAA AGACTTGTTC TCCATTAGAA ATGACAGCTG GCTTGACCCG
541 ACAGCATAAT AAGCAGAGTG TACTCTGATT GGAGAACTTT AATGTGTTTC ATTCAGTATT
601 ATAAAAGGAC AGTATTACAG ATTTTGTTGT ACACTGCTGT TACATGTGGG GCAGTGTGTC
661 TTTAAGTAGG GTAAAGTACT CTTTAAAAAT GGGTCCTAGA TATTTTTTCC TTTAACTCAA
721 GTCTCTTACT GTTTAAATGA TTTTTATTTT GTTTAATATG GAGGAAAAAG AAGCGTAAAT
781 GGACAATATA TATTTAGAGA AAGATGGTTA GCTGTCAGAA AAATATGCAA ATCAT^AATCA
841 CACCAAGACT GCAGCACACC CCTGTCAGAT GGCTGTGATC AAGAAAATAA ATGACAATGA
901 GTGGTGGTGA AGATGTACTA AAGGG7VAACA CACACACACA CACACACACA CACACACACA
961 CACACTGGAG CAACCACTGT GGAAATCAGT ATGAATGGTC CTCAAAAACC TGAAGATAGA
1021 GCGGGGCGTG GTGGCATACA CTTTTATTCC CAGCACTGGG GAGGCAGAGG CAGGTGGATC
1081 TCTGAGTTCC AGGCCAGCCT GGTCTATAGC ACAGGTTCTA GGACAGCCAG GGCTACACAG
1141 AAAAACCCTG CCTTGATTAA ACCAAACCAA ACC7VAACCAA ACCAAACCAA ACCAAACCAA
1201 ACCAAACCAA ACCAAACCAG ACCAAACCAA AACACTGAAG ATAGAACTTC AGTATTCCAT
1261 TCCTAGATAT ATACCCAATG GAGACTAAGT CAGCAAGACA CCTGCACAGC CATGTTCACT
1321 ACTACACTGT TCACCACAGC CAGGCTGTGG AACCAGCCTG AGTGTCCATG ATAAATGAAT
1381 GGATAGGTAA CTTTCAAGGT AAATGGACTC TGCTGTGTAC ATGCCTCACA TTCTGTTTAT
1441 TCATTTTTCT TTATGAGGTG TCCATTCAGG AGTCACATGG TAGTTCTATT TTCAGTCTTC
1501 TGAAGATACT ACACTGGTCC CCACAGTTTA CACTTTTATC AGCAGTGAAT AAGGGTTCCT
1561 CTATCCTTAC CATCATTTGT TGTAATTTTT CTTGATGACC CTCTTTCTGA CAGGGATAGG
1621 ATGTAATATC AGTGTGAGGA AGTACAACTT GTTTTCTAAG TATTTATTGG CCCCTTGCAT
1681 TTCTTCTTTT GAAAACTGTC GGTTCCTGAC ATCTGCTCAG GTATTCATTG GATGTTGTTT



Figure 5C:

1741 CTTTGGTGTT TGAGTTCTTA TGAATTCTAG ATGTTA7VATC CCTGCCTGTG GTTCTCTCCC
1801 ATTCTGTAGG CTGCCTCCTC ACCCTGGCAA TTGTTGTCCT TGTTTTGCAG AAACTTTTGA
1861 CTTCATGGAA TCTCATTTGT CAGTTTTCCC TCCTCTGCTA TAGCCTGAGC TAATGCACTG
1921 GTTTTTACAG AGCCCTGGTC TATGCCTTTA TCCTCCTCTG GCAGCTTCGG AGTTTCATTT
1981 CTTACATTTA GATCTTTGAT CCACTTTGAA CAAGTTTTGG AGCAGGGTGA GAGATACGAA
2041 TCTAGTTCCA TTCTTCCATA TGTGATCCTA GTTTACATAG CATCGTTGGT TGAAGAGGTT
2101 TTATTTTATT TTTAAATAAT GTGTCATAAA AAACGAGGTG GTTGTAGCAG TGTGGATTTG
2161 TTTCTTTGTC CTTTGATCTA CAGGTCTTGT TTTGTGTCAG TCTCATGATG TTTTATTGCT
2221 ATGGCTCTGT CATACAGTCT GAGGTCAGGT ATTGTGATAT ACCTTCAGTA TTGCTCCCTC
2281 AGACTCAGGT TTGCTTTGGC CAGGAGTCAT CTTACTCAGT GCTCTTAGAG CTCCCCCAGC
2341 ATGTAGCTGC TACTATTCTT AGTTGATAAA TCAGGAAACT GGGGCTCAGA GAGATTAACT
2401 GTCTTGAACT ACTTCTGGGG AGGTGAAACG TGGAGACACT AAACTGTGTT TACCCTGTAC
2461 TGCTCCAGTA GCTGTCGGGT GCTGGGCTAC AGCAAAGCAC CTATACTATA TATTACTCAG
2521 GAGGTGG7VAA AACTCAGCCT CCCTTGGGGT TCCCAAGCTC CCAGGTGTCC AGTCACTGCT
2581 GGAAACCTCA TGGAGTCTGA AAGGAAGGGT TGAGGGTACA TGGGGCAGCG ATGAGGAGCC
2641 TGGGGCTGGG ATCTCCC7VVA CACCTGGATA TCCAGATGCC ACTGGGTCAG GGGGAGTTGG
2701 GAACAGAGTT GGGATGTCCA TGGACCTGTG ACAAGGCCAG GGCCAGGGGG AGGATAACTC
2761 TGGCTTTACT AATTTGCGAA AGTCCTTAGC TTAGCAGCAG TTGTCTGGGA GCACAGAGGG
2821 GCCTTCTGTA AGAGGCTCAG GCAGTGCCGC TCTGTAGGCG AAGGTCTTCT CCATGTTCCC
2881 CATGGTGGTT CTTGATGAAA GAGACAGTCC TTGGCTCCAA ACTGGTTTAT TGATTGTTCA
2941 TTGTGGAAAA TGGGTGCACA CCACCTTCTC AGGGTGGACC AGAGATCAAA TACCTTTTGC
3001 AGGGAGGAAT ATCTGGGAAG GGACGCTTAC TGGCTAAACC CTCAGGGCCT CTAGATACAT
3061 CATTAGCATG GAGAACTCTG TTCTGGGCTA CATGACCACA GGCCACATTT CCACAAGCCA
3121 CATGTGGGAA GTGTGGCACA TGTTCTAGGC CAGGAATCTG GTAGGGAGCG TGGAGCCACC
3181 TACCATCCCA GGTGGGTGCC TGGGTGCCAG GGACCCTGAA CCCGCTCAAC CTTACCAAGT
3241 TTCCTGGCAG GGTCCACTGT CCTACACAGA AGCTGGAGGA GGTGTGAGGG TTGTGTCTTT
3301 GTGGAATGTC CCATGCTGCT TGGGGCTCAG TTTCTCCACC TGTACCTCAT TGGTTTGGGT
3361 ATAAAAAGTG GGGATACTTT ATTATTCTCT GACTCGGTCC TGAGGT^AAAA GCATCGTGGC
3421 AGTCCAGGAA CCACACCCTG AGGTTCCTGC ACTGAAGGGA CTCCCTAAGT CTCTGGAGTC
3481 TCTCCCCTTC ACAGAGCTGC CATVAGTCTAG GTTCTTTTGA GGATAACAGA GCCATGCTTG
3541 GTAAGCAGAC AACAGCATTT GTTTACTC7\A CCTTCTTTTG TCAGCTCCCT CTTCATAAAC
3601 AAGTTGAGAC ACCATGCTGG CTTGAGGAAG ACTTCTAAAG CCAGACAACT GTGCAAGGAA
3661 GAAGAAGAAG GGGCAAGTGG AGTTAGCCTG GATGTAGCCC TCAAAGTCTC CAGAGACCAG
3721 CCATGAAGGC TCAAGTGGAG GGCAAGACCT GCAGCAGCCA AGCATCTGGC AGGAGAGGAT
3781 CCTGGGAACC CCTCTACCAT GACACACATT CTTCCTGCAG GTCACACTTA ATAGGCCATT
3841 TCTTATTTGG ATCTATCATG GTGTTCTGTG CGAGATTAAT GAGGTGTTAT GCTGCGAACA
3901 GAAAGTTATA T/VAAAACTVAG TCCCCCCCCC TTGTCACTGC TGCTAAGAAT GTAGCAGAAA
3961 TTGTCTCAAG TGTCTCTCTA ATCAGAAACA ATA7VAGGTCT CCTTGGATTC AAGCCCTCCA
4021 GTTTCCTCCT TCCTTGCTGA GCCTTGGACA CCCATACAAA CCTCCTGGAT GCTACAGCTC
4081 TGGGCAGAGA CTCCAAGGTG GGGAGAGACT GATGGTACAA AAGCAAAATA CTTGTTTGGG
4141 GGTACACCCA CTCCTCTGCC TGTGTGGTTC CTGCAGTCAG TCCTGCAGAC AGGCCCTCAG
4201 TGGGTCTTCC ATGGGCAACA CGCAGAGGGA GGCAATGGAT GGGAATACCC ACACCCTGGT
4261 TAGTTTACCC CGGCCATGCT CTCTGCTCTT CATCCCTCCT CTGCCCTCTG CCACGGCTTT
4321 CTCTGCAGGA ATCATATCTT CATATTGGCC CACAGGTGTT CTCCTCACCC TAGCTATGAT
4381 GTTTACTTTA GAGTGACCTT AGCAGGGCTG GTGGGAATGA GTTCTAGAAG GCTCACGGAG
4441 ATGCTAGGGA AGAAACGTCT TCTAACTACT GAGGTTACTA AGTTCCTGGT GGTTGTCTCT
4501 GCCTTTCCCT TGTTAAAGTC ACCTTGAAGT TAGTGCAGAA GAAATCAGAG CCCAGTCACA
4561 GAGTAAATAT GGTCCTGAAG ATTTCCTTTG AGTGCCCAGA ATCCATGACA TTTCAAGAGC
4621 CCTCTTTGTA CCTT7VAGTCA TTTGGGGTTG TATCTTCTGC TTGATGTATG TGTGTGTGTT
4681 TATCAAAGAG TGAGATGGTT ACATAAGAGG TGCTCTAAAG GACAGAGAGG ATTTGCAATT
4741 GTGGCATGTG ACATCCTCAG GCCTTGCTCT GGTGCCAGGA GGAACTGATG CAGAAAAGAG
4801 TAAGAGGTCA TTTCCTGGAG GCTGTCACTA TAGAGGAGAT CTTACAGTGC ATTCCCTCCT
4861 CCAGGCCCTG CCTGAGGATA GACATGTGCT GACTGCMCT GAAACAGAGG CTTGGGATGG
4921 AGAGTTAGGT TCACAGAAGG GAGGGTGGGA GATGGATGCT TGCTGGGTTC TGGGTCTCAT
4981 CACCAGCTCC TGACCACCCG GTCAGCCCAT GTGCTTATTC CATAGCTTTC TTTTGCTATG
5041 TTTACTCAGT GTGGTGTTTG TTGGGACCCA GCAGAAGCCA GTCCCAGGCT GACAGCTGTG
5101 GATACACAGG GCAGCATGAG GGTCCTCAGC CTGAAGCAGT CAGGCTGGCA GAAGAGAAAG
5161 ACCAGCACAC ATTCCTTCAA CCAACTATGT CTTGAAAAAC AAACATATTA TATCACATAT
5221 ATTGCATTTA TGAGACAGCT AAAATGTACT CGGGTAGCAT GACTCCAGGT GGGGATATCT
5281 GCAAGTGCCA TGAGTGGCAG AGGGACAGCC AATGTGAGGC AAGAAGGAAT TCTGGCTCAA
5341 CACAGCTTAG CTCCCTGGTG TTGGTTCAAA CTTTGAGAGT TTGACCACAA GCACTTTATT
5401 TTTGACATAT TTAAACAGAG CACAACTTTG GGAAAAAGTT TTCTTATGAA AATTATCACA
5461 ATAAAGCTTA AGGCATGACT ACATTAAAAT GCCTTTGCAA AGTATATGTG CCCTCTTCCA
5521 CAAGAATGGT TCTATTGACT GAGAAATAAT GTTCAGGATA AAGATCCAGG AAGAAAAGAT
5581 CAGGGATAAG TAAAATACTA AACTCTTTTG CAAAGTACAT AGACCCTCTT TCATAACAAT



Figure 5D: 

5 641 GGGTTCTATT GACTGACAAG CACTGCTCAG GAGTTGGGAA AGAGTCTAGC ATAAGCACGA 
5701 TAGCCTGGAG ACTCTAGTGA GGTCTAGTCT TACAGACAGC AAAMTCACC AGGTTACAAA 
5761 CTACATTCAT TTCCAGTTTT CTGATCAGGC ACAGGTATGA ATCCCTTCTG TTGAAGAGAA 
5821 AAGTCCATGT GTTTA7\AATA TCTGGTTTCT CCAGTGCTAT TAGCGAGAAG ACTTGAGCCC 
5881 TATACAACTC CCACCTGGAG TGACATCCTG TCTTCATGGT ATATTACATA CCTAGACACG 

5 941 CTCATCTCAC AGACTTAGGA CTTTGTCTTC TGATCTCCAT TTCTGATCCC ACTTCCACCT 
6001 TTGCCTTGAT AGTGTCATTT TCTTCACTGC CTTGGTGACA ACCATGTTAT CCTCTGTGTA 
6061 TTTGAGTGTT ACCATTTTCA GATTTTACCT GTATGCAAGA TCACACAGTC TTTGTCTTTC 
6121 TGTCTGGATG CATGCTAATC TCTACACAAC AACCCTTCCC CGTCACTCAG ATCTTCCTCC 
6181 ATTAACACAT ACATGGTGCT GAAGAGGCTA GGGAGCTTCC CTTCAGTGGG GAGCTAGCTG 
62 41 GCTATTGGGC CTTTTTGACT GTCCAGGAAG GCCCCCAATT GCTGAGACAA GAACTTAGAT 
6301 TCTTCATTAT TGACTCTAAC TCATGTATCA AGCAGAAGCT AATGAATAGT TATCAACAGG 
6361 ATCAGAGGTT CCAGTGTAAG ACACTTTGAC ATGAJU^GAAC GGAGGAAGGA CAGATGGATG 
64 21 CATAAAAGCA GGACCACTGC CCCAGGAAGG TCCTGGAAAC TGATGCAGGG CAAAGGACAG 
64 81 GTTATA7\ACC AAATCTTAGG GAGTCAGGAA GAGCACAGAG GAGCTCAACC AACTGACCAC 
6541 TGCTTAGGGG CTACCAACCC AATCCTCCCT GTGGGAACAG CTAAGCTATC AGCCAAGGGT 

6 601 AATAAACAGG CAGGACCTGT GGATGACATG GAGAGCATAG GGACCCTGGG TCCAGCCTTT 

6 661 AGCACCTGCA CTCTCAGGAT ACTCCACCAT TGTGTCTTAG AGAGCCTAGG GATACTGGGT 
6721 CCAGCCTTTG GTACCTTCAC TCTCAGGGTA CCCCATCACT GTGTCTTGGA GAGCCTAGGC 
6781 ACCCTGGGTC CAGCCTTCAG TACCTGCGCT CTCAGGACAC CCCACCATTG TCTCTTGCCC 
684 1 CGTCTCTTCT TCCTCTTCCT CCCTTTCATT GTCTCTTCTC TGTTTCTTTC TTGACTCTCC 
6901 TTTCCCCTCA CACCCTCACT CTAGTTCTCC CCTTCCCTCT CTGCATCACC CTATTCTCTC 
6961 TGTGGTCCCT CCACTTTCCT TTATCTCTCA TGCTTCTCTC CTCCCTCAAA TACTTGTCAC 

7 021 CCACTATACT TCAGGGGCCA GCTCTAGTGA CAAAGCTGTT AATAGCAAGA CTCTCAGATC 
7 081 TCCAACGGCT CAGAGGAGCC AGACCCACCA AG7VACTCTCT CCAGGTCCAA TTTCAGGTTC 
7141 CTTCGAAAGC TTTCAGCAAA TGCTCAGGGA ACATGCCACT AACAAGAAGA TGCAAATTCC 
72 01 AGTTGAGAGT GGGAAAGGCC CTTGCGTAGG TCCCATCTTC CAGGCCAAGG TCAGAGGGGC 
72 61 TCTGTGTAAT CCGGATTGAC AGGGCTCAGA ACAATGTTTT GTTTTTAAGG TTTATTTATT 
7 321 TTAGGTGTTA GTGTCTTTGC TTGCATGACC TTATGTGCAT CATGTGTGTG CAGGTTCCTG 
7 381 ATGACAGTAG AGGAGGGCTT TGAATCCCTG GGGATAGGAA GTTACAGGAA ATTATAAGCT 
7 441 GCTTTGTGGG TCTTCTAGCT TTCCCAACAG AAGTGAATGC TCTTCACCAC TGAGCCATCT 
7 501 CTCTAGGCCC AAGAGACATT GCTTTATGGA TATAATTGTG TGTGTGTGTC MCATTGAGG 
7 5 61 AAAGGGAAAT AAAAAAAAAA CTTCAGCCGC TAAGGTTGTA CAGTTTCACT AATTGCTACT 
7 621 TTTAGTTGTG ATAAAATGGC AGGTGCTTCA ACATTTATAT ATACAAAAAC TTCCCTGCTG 
7 681 GTGGTTCAAC TGTGAGAACT GGGGTAAGTG GGTGAGTTCT CTTTTTCTGT CTCTGTCTCT 
77 41 GTCTCTCTCC TTCCATTCTT TCTTAAAGGA AATAAACATT GCAGCTGGGT TATAGCTCAT 
7 801 CAATATGGAA GTTACAGAAG TGAAAAAAGG CATTGCCTTG GTGGGTGGTG TTACCAGCTG 
7 8 61 ATTTTTGGTT GTCCTGCAAG GAGGTCTGGG GACTGGCTGC TCTGTCTCTG TCTGTATGAG 
7 921 TGAGGGAAGT CTGGGGAGCA GATTCCCTAA CCTTCAGCCT GGCCTGGTTC CTGAGTGAAC 

7 981 CCAGCCTCTC TGGTCCTAGT AGCTTTTTCC AAACAGGAAT CTGAGTGGTG ACAGGGMCA 
8041 AGTACCAGCC CATTGCTTAA GTGCCAGGGT TAGTGAGGGC AGGAAGCTGC CATAGCTGGG 
8101 ATTAGTAGTT GTATTGGATG TAGGAAGTCC TATCCTGGGA CAGCTAATCC TTAATGCTTC 
8161 ACTGGAGATT TTCAATGAGA AATTTATCCC ACGGCCCATA TGGCCCCATC CTTTTGTCTC 

8 221 CAACAGCCAA GTATTTTCCA TTAGAGGAGA CTTCCTGTAC ACTTGATGGA TGCTCATTCC 
82 81 AAGGTGACTT GGGGCAGTCA GTACAGACTT GGGATGACCT CTGACAGCCT AACCTCTCCC 
8 341 CAACAAGGGC CCTCTATGTT TGCTATGTAA TGTAATGTCA GACATTGTCA GGAGTGTCCG 
8 401 CAGCACAGCC TGCCCAGTGT GAGGGCTCTC ATAGGTTTCC CACTGTCTTA TCTACACAGG 
8 4 61 GATAACGAGG AGGTAAGCTG CAGTTCCCAG TCTCACTTCA CAGAGGAAGA GATAACCCCA 

8521 TCCCAGGTCA TGTAGCCAGC AGTGGAAAGA ATGAGGATTT G7VACTCAGGT CTTCCAAGTC
8581 CCATTGATAG CATCTCCTCA CAAGTCCCTT GCCACCCTCA CGATGCCTTA GACACTTGCC 
8 641 TGCCCTTTAT ACTAAGGAGA TGCAGGTACA AGGGGTTTAC CCATGTAGCA GCTGAGGCAG 
8701 CTGGGGATAG ATACCAGCAG CAGGCCTGAT GTCACCACTC TAACTCCAGC ATCCCCAGTC 
87 61 TGTGTTCCTG GAGTGTGAAA ATCCCTACTT AACAAGATTG TGCAACAGTC CTTGGCTCTG 
8821 TGACCCATAG CTGGAAACAG GATTCTCATT GATTTGTGGA ACATGGTGGC AGCCAGCCAA 
8881 AAAGAGGGTC TGCATACAGA AGACACGTGT GGCAAGGCCA CAGCAGACTC TGACTACCTT 
8 941 AGCTTACAGA ATTACAAGGT CATAATGTCC TCTGCTTTGG TCACCTCATG TTAAGGACAG 
9001 GCCCTAATGA AGATGGGGCA GAAGACTGAA GGAATGGCCA ACCAATAACT GGCCCAACTT 
9061 GAGACCCATC CTACAGGCAA GCATCAATTC CTGACACTAC TAATGATACT CTGTTATGCT 
9121 TGCAGACAGA AGCCTAGCAT AACTATCCTC CGAGAGGTCC ACCCAGCAAC TGACTGAAAC 
9181 AGAAAAAGAT ATCCACAGGC AAACAGTGGA TGGAGGTCAG GGACTATTAT GGGAGAGCTG 
9241 TGGGAAGGAT TAAAAACCCT GAAGGGGATA GGAACCCCAC AGGAAGACCA ACAGAGTCAA 
9301 CTAAGAGACC TGTGGGAGCT CTCAGAGACT GAGCCACCAA CCAAAGAGCA TACACAGGCC 
9361 GGTCCGAGGC ACCTGGCACG TGTGMGCAG ACATGCAGCT CAGTCTCCAT GTAGGTCCTC 
9421 CAATAAGCGG TAGCCTGACT GCAGTATCCA ATCCCCAACA GGGCTGCATA GTCTGGCCTC 
94 81 AGTGGGGGAG GATGCCCCTA ATCCTGCAGA GACTTGATGA GTGGAGAGCT ATCCAGGGGG 
9541 AACCCACCCT CTCTGAGAAG GGAATGGGGA TGGGGGAGGG ACTCTGTGAA GAGGGGACAA
9 601 GGACAAACAA GAACCTCAAA TAGGTCAGGC CCTAAAGGCT TGCTAAGTAG CAGTGGCCCA 
9661 GCTCTGTCCT GTTCCTCAGC CCAAGGCTCA GCTCCCACCT GTTTCTGTGT TTTTCTGGCT 
9721 TTTCATGGGC CTAGGACTTG GTGACCAGTT CAAACAATGG GGCCTGTGGA AGACACAATA 
9781 TACAAGACTA GGGACATTCC TGTTCTGCTG ACTATCCATA GCCTGATGTA GGTGG7\AGGA 
9841 CCCAATCACT GGATTTCTAC CCTTGCACAA CCTTGACAGC TGAGGGCCTC TCAG/UVACCT 
9901 ATTTCTTCCA CTGAAAAATG AGACTCTCAA ATGAACGTCG TGACAATCAT CAGGCTTATT 
9961 AAAGAGGTGT ATCTAACCTG AATGGCAAGC AGACAGCAGG CAAATGTCTG TATCAACCTC 
10021 TAGGAAGGAC AAGAACTGCT CACTGCTGCC CCCCAGGAGG CCATTTGCTG AAACAGCTGC 
10081 TCTCCTGCTG GTGCACAGGC CCTGCCTTCT CATTGCAGCC ACAGCCCCTT CCTGTCTGAA 
1014 1 CCTCCTGTCA GGTCACTGGG AAACAGATCA AGATGGAACA GGACAGCTCC TGATGGTAAA 
10201 TAAAAAACAG TGGTCATGGC TATTCATAGG GGTTTATGCT TCTTCAGTCC ACACTGTGAA 
10261 GAGCTGTGGG CATGAACCAC AGTGTTCGAG GTAGAGTTGG GGTTCTGAAA TTCACAGTGG 
10321 GGTGAGCTCA GTAAATGTGA GCTGGAGGTC ACTCGTGAGA CACACAGTCC TGCTGCTTCT 
10381 GTTCCCAATA TCCTGAGGAG ACGACACATC TACTTTGTTC AGAGGCCACA GTCTAGTTGA 
1044 1 CCTGAGAGTT ACCAGTTTCT TATTTGTGTG TGTGTGTGTG TGTGTGTGTG TGTGTGTGTG 
10501 TGTTGTTCGT GTGTGAGTGC AGGTGCACAT ATGATAGCGT ACACGTTGAG GTCAGAGGAT 
10561 AACTATCAGG CGTTGTCCCC TCCTACTTTT CCTCGGACTC TGGAGAACAA ACATGGGTCC 
10 621 TTATTCCAGG GGAGCAAGTC GCTGTTGGCT GACACATCTT GCTCACATAC ATTTTACCTA 
10 681 GACAATGGAG CCTCCATCAG AGTATTACTT TAGCTCCTCA CCGATGGCAA TGCACCACCT 

107 41 CTCTACCCAC ATAGGAGTTG GGTCTCCACA CACCCCCACA CCCCCTTCAC CAAAACGTTT 
10801 TCAGTTACTT TATCTGGTAA AGTTCATCAG AGAATGAAGC CAGTATTAAG AACATGGAAT 

108 61 CATTTGGGAA CCTGGATCTA GCAATACCCC ACCCTAGATG GAGTTGCTGA GTTTTCACCT 
10921 CAGATTATAA TTCCCCCCTA GCTTCTATGG TTTATTCTGA AACCAGGGGA ACTCGATTCC 
10981 TCCCTTTGGA CCACAGACAT CCTGGCTTGT GAATTCACAT GTCATCTACT GCTAATCCAT 
11041 TGGTAGtATG TGGCTCACAG AGACACACTA CAGTCATGGC CAATGTCAAG GTAGGACAGA 
11101 TGTGAATCAT TCCCCCAGTC CTGCTGTTTT CATGACTAAC CCTCCTCAGC ACAGTGACCA 
11161 TGAACCTACT TTTCCCCTCC TTTTATTTTT AGAATTGCTG GAATTTTCTA TTTTGAGAAA 
11221 TAATAGCCTT GGGCAGCATT AAACAAAATC ATCTAGAAAG CTGGTTTTUVA ATACAGATGG 
11281 TTGAGTCAGT GAAAGAGTGA GGAATGTCAT TATTGGCCCC TCACAGAGGC TGGCTCACTC 
11341 CAGCAGAGGT GGTTGAAGCT CTTGGACACG GGTCAGGTGC ATAGGAAAGG TNGTCTGGGA 
11401 CACTGAGAAC CACAATTGAA CAAACAGAAC TGTTGGCTTT TTTTTTTTTA AATGAGTTCT 
114 61 CAAAAAATGA CTGGCTAGCT TAGGCAAATA CTTCGAGCCA ACCCAACAGA ACATTCTTCC 
11521 ATTGATTCAT TCTGGATCTT CTTTCTAGAC AATACTGAAC TGACCCCTTG TTGGCAGTCT 
11581 CAAGTTTGAC AACATAGGGC TTTGAACTTG GCACAAGGTC CATCACTGTC ACCC7VAGCAT 
11641 CCTGGGTGAC CTTTGGGTTG GAATATCTTG GCTAACCTTA GATATTTTCT TTGGAGTATC 
11701 TTTAGAACAT CCAGGAAATA GGGCTTGATT CTCATCCTGG GACCACAATA TAAGTCACCC 
11761 TAGAATCCCA GGAGATCGTG CAGAGAAACA AGGATCTCTC TCGTGTGCAT CCTTCTTCAA 
11821 AGCAGTGAGT AGTGACTCCA CTT^AACTGAG TTCCCATCTG AGAGTCCACA GGAGGCTTTG 
11881 GGGCAAG7VAG CAGAGGGAAG GCACTGTTTG TGTTGGT7VAA GTTTTGACTC TAACAAATTT 
11941 GAAGACATAG ATGACATTGT GTCAGACTAA CAACAACCTA GACTCATGTG GGTTCTGTTT 
12 001 AGGGATCAGA TTTTATTCAT CAATGACTTG TCTTAGTGTA TAGAGAAAGG CTTCCTACTG 
12061 GAGTGTAGGC TCAATAATGA CAGAAGAGAT AGCTATTTCC CCTAGGGACT GTGCTGCTCC 
12121 AAGTTTGGTG GAGAAAGGCA GTGGGGAACC TAGATGTGCT CTCTGGGGAG GGGGTCTGAA 
12181 GCTGGCTTCA TAG7VAGGTGT GAAGTTTTGC TGAAACATCT AAACAGAATT ATAGCTTAGG 
1224 1 AAAGTGAGCA GGCAAGGCAG GGAATGTGTT GCATATGTAT ATGTACATGA ATATATTATG 
12 301 TTATAGATAC ACACACATTT GAACCTCATT TGCAGATGAC AGAAAATAGG TTATTTTGCC 
12361 TCTCTTAACT GCTAAGCACA ATGACTTCCA GTTCCATCCA TTTCCTGAAA TGCCACAATT 
12 421 TCATTTTTCA TTGTGGCTGA ATAAAATTCC ATTGCAGACT GGGCCCTACT TCATCCACTC 
12 481 CTGAGGGCAG GCATATCCCC TGGCTCCATT TCTTACCTAT TGTGAAGAGA AGTGCAACTG 
12 541 TCTTGTTGAA AGGCAAGCGT GAGAGAGGCA GGCACTAATT GTGGGTTTTT GTTTCTTCTT 
12 601 CCTGCTATGA CTCTCCATTT GTCAGAACCA AAGATCGATA AAAGCCGCCA CCATGAAAGC 
12 661 CATCTTAATC CCATTTTTAT CTCTTCTGAT TCCGTTAACC CCGCAATCTG CATTCGCTCA 
12721 GAGTGAGCCG GAGCTGAAGC TGGAAAGTGT GGTGATTGTC AGTCGTCATG GTGTGCGTGC
12781 TCCAACCAAG GCCACGCAAC TGATGCAGGA TGTCACCCCA GACGCATGGC CAACCTGGCC 
12841 GGTAAAACTG GGTTGGCTGA CACCGCGCGG TGGTGAGCTA ATCGCCTATC TCGGACATTA 
12 901 CCAACGCCAG CGTCTGGTAG CCGACGGATT GCTGGCGAAA AAGGGCTGCC CGCAGTCTGG 

12 961 TCAGGTCGCG ATTATTGCTG ATGTCGACGA GCGTACCCGT AAAACAGGCG AAGCCTTCGC 

13 021 CGCCGGGCTG GCACCTGACT GTGCAATAAC CGTACATACC CAGGCAGATA CGTCCAGTCC 
13081 CGATCCGTTA TTTAATCCTC TAAAAACTGG CGTTTGCCAA CTGGATAACG CGAACGTGAC 
1314 1 TGACGCGATC CTCAGCAGGG CAGGAGGGTC AATTGCTGAC TTTACCGGGC ATCGGCAAAC 
13201 GGCGTTTCGC GAACTGGAAC GGGTGCTTAA TTTTCCGCAA TCAAACTTGT GCCTTAAACG 
13261 TGAGAAACAG GACGAAAGCT GTTCATTAAC GCAGGCATTA CCATCGGAAC TCAAGGTGAG 
13321 CGCCGACAAT GTCTCATTAA CCGGTGCGGT AAGCCTCGCA TCAATGCTGA CGGAGATATT 
13381 TCTCCTGCAA CAAGCACAGG GAATGCCGGA GCCGGGGTGG GGAAGGATCA CCGATTCACA 
13441 CCAGTGGAAC ACCTTGCTAA GTTTGCATAA CGCGCAATTT TATTTGCTAC AACGCACGCC
13501 AGAGGTTGCC CGCAGCCGCG CCACCCCGTT ATTAGATTTG ATCAAGACAG CGTTGACGCC
13561 CCATCCACCG CAAAAACAGG CGTATGGTGT GACATTACCC ACTTCAGTGC TGTTTATCGC
13621 CGGACACGAT ACTAATCTGG CAAATCTCGG CGGCGCACTG GAGCTCAACT GGACGCTTCC
13681 CGGTCAGCCG CATAACACGC CGCCAGGTGG TGAACTGGTG TTTGAACGCT GGCGTCGGCT
13741 AAGCGATAAC AGCCAGTGGA TTCAGGTTTC GCTGGTCTTC CAGACTTTAC AGCAGATGCG
13801 TGATAAAACG CCGCTGTCAT TAAATACGCC GCCCGGAGAG GTGAAACTGA CCCTGGCAGG
13861 ATGTGAAGAG CGAAATGCGC AGGGCATGTG TTCGTTGGCA GGTTTTACGC AAATCGTGAA
13921 TGAAGCACGC ATACCCGCTT GCAGTTTGTA AGGTACCCGG GGATCACAAC TTGCCCTCTG
13981 AAGAGGTVAGA ACAGAAGGAT GCCACAACTC TCCTGCTGGC TACTCTCCAG TGGTTTCATC
14041 TTACTTCTGA TGGCATTTCC CTCTAGAAAG TGCTACTATC ATCCACACAT TTCTACCTGA
14101 GACCACCCAA AGGACCCTCC CAAATTCTCT TCCTCTCTGA GTAGTCTCCA CACCTGTTAC
14161 CACCATCCCA GAATTAAAAT CCTAACTGCA CTCTGGCGTG TGACTTGCCT CAGTCCTTGC
14221 AATAAGAGTT GTTGGCAGTG CCAGGCGTGG TGGCGCACGC CTTTAATTCC AGCACTTGGG
14281 AGGCAGAGGC AGGCGGATTT CTGAGTTCGA GGCCAGCCTG GTCTACAGAG TGAGTTCCAG
14341 GACAGCCAGG GCTATACAGA GAAACCCTGT GTCGAA7VAAC CAAAAAAAAA AAAAAAAGTT
14401 GTTGGCAGAG TGTGGGTTAT ATACCAGGTG GAGATTTC7\A ATGAGTGGCT GAAGCTGTAG
14461 CCAGAAGGAA CTTAGAGGAT AGCTCATAAC TTAAAAAGAA ATGTAGAGAG TAGCAGAAAC
14521 ATTGAGAGAG TGGGCACACA GCCACTGTGT GAATGTGGCA GAACACAATC CAGCCAGCTA
14581 TACATGCATA AGTGTATATT GGCGCCATCC TGACTGATGA GACACAGGAA AACAGATAGA
14641 CGGGGTTAGG TGGCCATGGC CTTTCCTGCC TGCCTCTTCC TAAGGGTCAT CTCAAGACCT
14701 TATGCTCTCT TAACTCTTCC ATTGCTACTT AGCTTCTAGA TATCACCTCC AGATTAGTCT
14761 CCTTGGGTAC ATCAGTGATC CTGGTGATAT CCAGGGCTTC CTGATTCCAT CTTTGTCATA
14821 GAGGCTGCAA CTAAAGAGGT CTTCTTAATA CTTCACACCC TGATGCCAAA AGGAAGACAC
14881 AGAAGTTCAC AGAGGTGAAG TGATTCATGT AGGACATACA GTGAGCAAGC ATCAGGGTCC
14941 GGATTATCTG ACTCTACTCT AACTTTTATG TAAATGTGCT TTATGCCATT AACACTGTCA
15001 TTCCTGTGCT TCAGCTCTGG GAGACTCCCA AGCACTCTTA GGCACAAGCC ACAATTAAGG
15061 GACTCTGACA CTCTGCATTG ATTAATTAGC ATGGTGGTCT CTATGTTTCC AGATTCATGA
15121 TTGTTTCACT TTCCATATAG GCTATGAAGG GTGTGAGGAA ATTTTTTGGG GACAGAATTG
15181 GAGGCAATCC ACCTCTCTCA GGAAGCCTCT ATCTGGAAAA GCTTACAACT CAGGGACAGT
15241 AACTGTAGGC CCAGTCCTTG GTGTCCAAAA TGGGTTTTAT GGTTTGAATC TGCAAAGCCT
15301 TCCATGTGCT CAAAGGTTTG AACATGGAGC CTCCTCCTGG TAACACTGTA TTGGAGGCTT
15361 TTGAGACTGG ATGCTCTTTG GTCCCATGTT TTGCTACATC ATCTGTCAAG ATATGACCCA
15421 GGCATGCTAC CAGCTACCAC AGACTATGCC TCTCCAGCTT TCATGTTCTC CCCACCATGA
15481 TAGACTTGTA TCTCCTAAAA ATGGAATCAA AGCAAACTTT TCCTGCATTA AGTTTTTTTT
15541 TTTCTGTTAA GTGTTTGGTC ACAGGGACAA GAAAACACTC AATACAGATA ATTAGTACCA
15601 GAGTTGAGGT TCATTGCTCT AGCAAGTTGG ATCAAATTTT TAGGGCTTTG GAACTGATTT
15661 ATAAGAGACA TGTAGAAGAG TCTGAAGCTG TGGGCTACAG AAGTGTCACC AGTTTTTAAG
15721 AATAGTTTAA TACACCATGG GAATTGTGAA AATCAGAATG CTCACACAAA GGCAGACAGG
15781 AAAACGTGAG CATGTGGCGT GTGAGAGGGC ATAAGAAGGA ACCTAGGGGG AAATGAGCTA
15841 GAAGCCATTC GGCTACGTTA GGGAACGTGT GTGGCTGTGC TTGGCCCATG CCCTGGCAAT
15901 CTGAATGAGG CCAAATTTTA AAGGAGTGGA CTAACTCGAT TGTCAGAGAA AATATCAAGA
15961 CAGACCACCA CTCAGGCTAT GCCGTGTTTG TGACCGACCA GCTACTCTTA GCCAGCTCTA
16021 TTGTGAAATT CCAGAGCAAT TATCAGAGCA TGAAGATACA TACAGTTTAG TGAAGTAAGG
16081 GGTGTGGGTC CCTAAGTGGA TGGTGCATAA ATCTATGTAG GTGATGCCTA AGTGACACTT
16141 GATAATCCAA AATATCAGCA ATGTGGAATG TCTTCCAAGG AGACCTGTAG ACACACATTT
16201 TAGAACTTTG CTCATGGCTG TAAT7UVATAG CTAGCTAGAA ATCATTTCCT GAAGAGGTTA
16261 GTCTGAGTTA CGGTTCCAGG GCAAACATTC AGTGATGGCA AGGAAGGCAT TGCAGTCAGG
16321 AGCCAAAGGT CAGCTGGTCA CATTGCATCA AGAGTAGAGA GTCAGAGTGT GAGTAGAAAG
16381 AGGATACAGG TTATAAAACC TCACTGTCCA CTCTCAGCAA TCCATTTTCT CCTAAAAGGC
16441 TTTACCTTCT AAAGATTTTA GTCTTCAAAA CCAGTACCAG TAGCCTGGGA ACAAAAGTTG
16501 AAACAAATGA GCCTTTGTGG GGCATTTCAC ACTTAAAACA GGGCATCACC TAGGAGGAGC
16561 CCTGTGTGCA GTAGGAAGTG TGGCCTCTGT GTCAGGAATG CTCAGGCTAA TAAGGGGTCC
16621 TCTATCTGAG GGACCCTATG AAGATTCAAC AAGTAGTTGT GAGAATTCCC TGTAAATGGA
16681 TGCTACCAAT TTGACATTTG TAGACCTGCT ATTGTGTGCT TCTTTATTGG GCTCTCCCAT
16741 CTCCCAACTT TCCAACCCAT ATTCCACATT AATCCCTTCC ACCACCATGC AACACTAGGT
16801 AGGAGAGAAG GAAGGTTAGA AGAGAAAGTG GGTATAGATC TATTTAGACT ACTTCCTGCT
16861 GATTAGGGGC AAGTCCAATC GTCATTGTCA GGATACCTCC AACCAGCAAC CAGCAAACCA
16921 GCAAATCAGA AACAGCAAAA GCAGCCAACA AGGCAGCACT AACCAGCAGG ATTGGGGTCG
16981 GTAGCGTGGG AGCAGTCACT ACTGGTCTTC TCATGGCTTT GGCATTAATA CTCTCTCAAG
17041 AAATTCCGTA ATTTTTTCCC CACCACCTGA AATTCCGTAA TTTTAAATGC AAACTATCTA
17101 CAGCTGGCAA AAATCACATC TCTCCTAGAG CACAAGACAA ATCATAGTTA CTGGCTATTT
17161 GCAATCTGAA GCATCTCAAT ATCCCACACC TGGGATTAAA ACAAAAACAT ATTCACATCA
17221 CATAACTGTT TTTTTTTTCC AATTTTTTAT TAGGTATTTT CTTTATTTAC ATTTCAAATG
17281 CTATCCCGAA AGTCCCCTAT ACCCTCCCAC CTCCCTGCTC CCCTACACAC CCACTCCCAC
17341 TTTTTGACCC TGGAGTTCCC CGGTACTGGG GCATATMAG TTTGCAAGAC CAAGGGGCCT

174 01 CTCTTCCCAG TGATGGCCGA CTAAGCCATC TTCTGCTACA TATGCAGATA GAGACACGAG 

17 4 61 CTCTGGGGGT ACTAGTTAGT TCATATTGTT GTTCCACCTA TAGGGTCGCA GACCCCTTCA 

17 521 GCTCCTTGGG TACTTTGTCT AGCTCCTCCA CTGGGGGCTC TGTGTTTTAT CTAATAGATG 

17 581 ACTGTGAGCA TCCACTTCTG TATTTGACAG GCACTGGCCT AGCGTCACAT GAGCCAGCTA 

17641 TATCAGGGTC CTTTCAGCAA AACCTTGCTG GCATGTGCAA TAGTGTCTGC GTTTGGTGGT 

17 701 TGATTATGGG ATGGATCCAC TAGTTCTAGA GCGGCCGCCA CCGCGGTGGA GCTCCAGCTT 

17761 TTGTTCCCTT TAGTGAGGGT TAATTGCGCG CTTGGCGTAA TCATGGTCAT AGCTGTTTCC 

17821 TGTGTGAAAT TGTTATCCGC TCACAATTCC ACACAACATA CGAGCCGGAA GCATAAAGTG 

178 81 TAAAGCCTGG GGTGCCTAAT GAGTGAGCTA ACTCACATTA ATTGCGTTGC GCTCACTGCC 

17 941 CGCTTTCCAG TCGGGAAACC TGTCGTGCCA GCTGCATTAA TGAATCGGCC AACGCGCGGG 
18001 GAGAGGCGGT TTGCGTATTG GGCGCTCTTC CGCTTCCTCG CTCACTGACT CGCTGCGCTC 
18061 GGTCGTTCGG CTGCGGCGAG CGGTATCAGC TCACTCAAAG GCGGTAATAC GGTTATCCAC
18121 AGAATCAGGG GATAACGCAG GAAAGAACAT GTGAGCAA7\A GGCCAGCAAA AGGCCAGGAA 
18181 CCGTAAAAAG GCCGCGTTGC TGGCGTTTTT CCATAGGCTC CGCCCCCCTG ACGAGCATCA 
18241 CAPJkPJKTCGA CGCTCAAGTC AGAGGTGGCG AAACCCGACA GGACTATAAA GATACCAGGC 

18 301 GTTTCCCCCT GGAAGCTCCC TCGTGCGCTC TCCTGTTCCG AGCCTGCCGC TTACCGGATA 
18361 CCTGTCCGCC TTTCTCCCTT CGGGAAGCGT GGCGCTTTCT CATAGCTCAC GCTGTAGGTA 
18 421 TCTCAGTTCG GTGTAGGTCG TTCGCTCCAA GCTGGGCTGT GTGCACGAAC CCCCCGTTCA 
184 81 GCCCGACCGC TGCGCCTTAT CCGGTAACTA TCGTCTTGAG TCCAACCCGG TAAGACACGA 
1854 1 CTTATCGCCA CTGGCAGCAG CCACTGGTAA CAGGATTAGC AGAGCGAGGT ATGTAGGCGG 
18 601 TGCTACAGAG TTCTTGAAGT GGTGGCCTAA CTACGGCTAC ACTAGAAGGA CAGTATTTGG 
18 661 TATCTGCGCT CTGCTGAAGC CAGTTACCTT CGGAAAAAGA GTTGGTAGCT CTTGATCCGG 
18 721 CAAACAAACC ACCGCTGGTA GCGGTGGTTT TTTTGTTTGC AAGCAGCAGA TTACGCGCAG 
18781 AAT^AAAAGGA TCTCAAGAAG ATCCTTTGAT CTTTTCTACG GGGTCTGACG CTCAGTGGAA 
188 41 CGAAAACTCA CGTTAAGGGA TTTTGGTCAT GAGATTATCA AAAAGGATCT TCACCTAGAT 
18 901 CCTTTT7VAAT T7UVAAATGAA GTTTT7VAATC AATCTAAAGT ATATATGAGT TWVCTTGGTC 

18 961 TGACAGTTAC C7VATGCTTAA TCAGTGAGGC ACCTATCTCA GCGATCTGTC TATTTCGTTC 
19021 ATCCATAGTT GCCTGACTCC CCGTCGTGTA GAT7VACTACG ATACGGGAGG GCTTACCATC 
19081 TGGCCCCAGT GCTGCAATGA TACCGCGAGA CCCACGCTCA CCGGCTCCAG ATTTATCAGC 
19141 MTAAACCAG CGAGCCGGAA GGGCCGAGCG CAGAAGTGGT CCTGCAACTT TATCCGCCTC 
19201 CATCCAGTCT ATTAATTGTT GCCGGGAAGC TAGAGTAAGT AGTTCGCCAG TTAATAGTTT 
192 61 GCGCAACGTT GTTGCCATTG CTACAGGCAT CGTGGTGTCA CGCTCGTCGT TTGGTATGGC 
19321 TTCATTCAGC TCCGGTTCCC AACGATCAAG GCGAGTTACA TGATCCCCCA TGTTGTGCAA 
19381 AAAAGCGGTT AGCTCCTTCG GTCCTCCGAT CGTTGTCAGA AGTAAGTTGG CCGCAGTGTT 
194 41 ATCACTCATG GTTATGGCAG CACTGCATAA TTCTCTTACT GTCATGCCAT CCGTAAGATG 
19501 CTTTTCTGTG ACTGGTGAGT ACTCAACCAA GTCATTCTGA GAATAGTGTA TGCGGCGACC 
19561 GAGTTGCTCT TGCCCGGCGT CAATACGGGA TAATACCGCG CCACATAGCA GAACTTTAAA 

19 621 AGTGCTCATC ATTGGAAAAC GTTCTTCGGG GCGAAAACTC TCAAGGATCT TACCGCTGTT 
19 681 GAGATCCAGT TCGATGTAAC CCACTCGTGC ACCCAACTGA TCTTCAGCAT CTTTTACTTT 
19741 CACCAGCGTT TCTGGGTGAG CAAAAACAGG AAGGCAAAAT GCCGCAAAAA AGGGAATAAG 
19 801 GGCGACACGG AAATGTTGAA TACTCATACT CTTCCTTTTT CAATATTATT GAAGCATTTA 
198 61 TCAGGGTTAT TGTCTCATGA GCGGATACAT ATTTGAATGT ATTTAGAAAA ATAAACAAAT 
19921 AGGGGTTCCG CGCACATTTC CCCGAAAAGT GCCACCTAAA TTGTAAGCGT TAATATTTTG 
19981 TTAAAATTCG CGTTAAATTT TTGTTAAATC AGCTCATTTT TTAACCAATA GGCCGAAATC 
20041 GGCAAAATCC CTTAT7\AATC AAAAGAATAG ACCGAGATAG GGTTGAGTGT TGTTCCAGTT 
20101 TGGAACAAGA GTCCACTATT AAAGAACGTG GACTCCAACG TCAAAGGGCG AAAAACCGTC 
20161 TATCAGGGCG ATGGCCCACT ACGTGAACCA TCACCCTAAT CAAGTTTTTT GGGGTCGAGG 
2 0221 TGCCGTAAAG CACTAAATCG GAACCCTAAA GGGAGCCCCC GATTTAGAGC TTGACGGGGA 
20281 AAGCCGGCGA ACGTGGCGAG AAAGGAAGGG AAG7VAAGCGA AAGGAGCGGG CGCTAGGGCG 
20341 CTGGCAAGTG TAGCGGTCAC GCTGCGCGTA ACCACCACAC CCGCCGCGCT TAATGCGCCG 
20401 CTACAGGGCG CGTCCCATTC GCCATTCAGG CTGCGCAACT GTTGGGAAGG GCGATCGGTG 
204 61 CGGGCCTCTT CGCTATTACG CCAGCTGGCG AAAGGGGGAT GTGCTGCAAG GCGATTMGT 
20521 TGGGTAACGC CAGGGTTTTC CCAGTCACGA CGTTGTAAAA CGACGGCCAG TGAGCGCGCG 
20581 TAATACGACT CACTATAGGG CG/U^TTGGGT ACCGGGCCCC CCC 
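
The sequence rows above follow the usual ORIGIN layout: a position number followed by blocks of ten bases. The sketch below is illustrative and not part of the original listing; the function names `parse_origin_rows` and `base_count` are hypothetical. It joins such rows into one string and tallies the bases, and it records the printed BASE COUNT so the totals can be compared with the 20623 bp stated on the LOCUS line. The two sample rows are copied from the listing.

```python
from collections import Counter


def parse_origin_rows(rows):
    """Join ORIGIN-style rows into one upper-case sequence string.

    Leading all-digit tokens are treated as the position number, so a
    position that was split by scanning (e.g. "204 61") is also skipped.
    """
    parts = []
    for row in rows:
        tokens = row.split()
        i = 0
        while i < len(tokens) and tokens[i].isdigit():
            i += 1
        parts.append("".join(tokens[i:]))
    return "".join(parts).upper()


def base_count(seq):
    """Return (counts of A, C, G, T, number of other characters)."""
    counts = Counter(seq)
    acgt = {base: counts.get(base, 0) for base in "ACGT"}
    other = sum(n for ch, n in counts.items() if ch not in "ACGT")
    return acgt, other


# Totals as printed on the BASE COUNT line of the listing.
PRINTED_BASE_COUNT = {"a": 5449, "c": 4847, "g": 4902, "t": 5424}
LOCUS_LENGTH = 20623

SAMPLE_ROWS = [
    "20401 CTACAGGGCG CGTCCCATTC GCCATTCAGG CTGCGCAACT GTTGGGAAGG GCGATCGGTG",
    "20521 TGGGTAACGC CAGGGTTTTC CCAGTCACGA CGTTGTAAAA CGACGGCCAG TGAGCGCGCG",
]

if __name__ == "__main__":
    seq = parse_origin_rows(SAMPLE_ROWS)
    print(len(seq), base_count(seq))
    print(LOCUS_LENGTH - sum(PRINTED_BASE_COUNT.values()))
```

The printed a, c, g and t counts sum to one less than the LOCUS length. That would be consistent with a single ambiguous base in the listing (an N is visible in the row starting at 11341), though this reading is an inference rather than something the record states.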
Figure 18: Nucleic acid sequence of the known segment of the R15/appa+intron plasmid,
including the vector sequences of pBLCAT3 (SEQ ID NO: 2)



LOCUS       R15/appa+intron   6708 bp   DNA   SYN   15-APR-2000
DEFINITION  R15/appa+intron transgene with vector cut 13543 to 4954
ACCESSION   R15/appa+intron
KEYWORDS    salivary proline-rich protein; acid glucose-1-phosphatase; appA
            gene; periplasmic phosphoanhydride phosphohydrolase; artificial
            sequence;
SOURCE      synthetic construct.
  ORGANISM  synthetic construct
            artificial sequence.
REFERENCE   1 (bases 1 to 6708)
  AUTHORS   Golovan, S., Forsberg, C.W., Phillips, J.
  JOURNAL   Unpublished.



DEFINITION  Rat salivary proline-rich protein (RP15) gene.
ACCESSION   M64793 M36414
VERSION     M64793.1 GI:206711
SOURCE      Rat (Sprague-Dawley) liver DNA.
  ORGANISM  Rattus norvegicus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia;
            Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
REFERENCE   1 (bases 1 to 1748)
  AUTHORS   Lin,H.H. and Ann,D.K.
  TITLE     Molecular characterization of rat multigene family encoding
            proline-rich proteins
  JOURNAL   Genomics 10, 102-113 (1991)
  MEDLINE   91257817
FEATURES             Location/Qualifiers
     source          1..1748
                     /organism="Rattus norvegicus"
                     /strain="Sprague-Dawley"
                     /db_xref="taxon:10116"
                     /tissue_type="liver"
                     /tissue_lib="cosmid genomic library"
     misc_feature    1802-1810
                     /function="consensus sequence for initiation in
                     higher eukaryotes"



FEATURES Location/Qualifiers 

DEFINITION  E. coli periplasmic phosphoanhydride phosphohydrolase (appA)
            gene.
ACCESSION   M58708 L03370 L03371 L03372 L03373 L03374 L03375
VERSION     M58708.1 GI:145283
SOURCE      Escherichia coli DNA.
  ORGANISM  Escherichia coli
            Bacteria; Proteobacteria; gamma subdivision;
            Enterobacteriaceae; Escherichia.
REFERENCE   1 (bases 1811..3109)
  AUTHORS   Dassa,J., Marck,C. and Boquet,P.L.
Figure 18A:

  TITLE     The complete nucleotide sequence of the Escherichia coli
            gene appA reveals significant homology between pH 2.5
            acid phosphatase and glucose-1-phosphatase
  JOURNAL   J. Bacteriol. 172 (9), 5497-5500 (1990)
  MEDLINE   90368616
FEATURES             Location/Qualifiers
     source          1811..3109
                     /organism="Escherichia coli"
                     /db_xref="taxon:562"
     sig_peptide     1811..1876
                     /gene="appA"
     CDS             1811..3109
                     /gene="appA"
                     /standard_name="acid phosphatase/phytase"
                     /transl_table=11
                     /product="periplasmic phosphoanhydride
                     phosphohydrolase"
                     /protein_id="AAA72086.1"
                     /db_xref="GI:145285"
     mat_peptide     1877..3106
                     /gene="appA"
                     /product="periplasmic phosphoanhydride
                     phosphohydrolase"
     mutation        replace(1817..1819, "gcg changed to gcc")
                     /gene="appA"
                     /standard_name="A3 mutant"
                     /note="created by site directed mutagenesis"
                     /phenotype="silent mutation"
     mutation        replace(3092..3094, "ccg changed to ccc")
                     /gene="appA"
                     /standard_name="P428 mutant"
                     /note="created by site directed mutagenesis"
                     /phenotype="silent mutation"
     mutation        replace(3095..3097, "gcg changed to gct")
                     /gene="appA"
                     /standard_name="A429 mutant"
                     /note="created by site directed mutagenesis"
                     /phenotype="silent mutation"
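
The three mutation features above are each annotated as silent. A minimal sketch of how that annotation could be checked is given here; it is illustrative only and not part of the listing. The codon table covers just the five codons involved, the A429 replacement codon is assumed to read gct, and the names `CODON_TO_AA`, `MUTATIONS`, `residue_number` and `is_silent` are hypothetical.

```python
# Only the codons named in the mutation features; values follow the standard genetic code.
CODON_TO_AA = {"gcg": "Ala", "gcc": "Ala", "gct": "Ala", "ccg": "Pro", "ccc": "Pro"}

CDS_START = 1811  # first base of the appA CDS in this construct

# (name, first base of codon, original codon, replacement codon)
MUTATIONS = [
    ("A3", 1817, "gcg", "gcc"),
    ("P428", 3092, "ccg", "ccc"),
    ("A429", 3095, "gcg", "gct"),  # assumed reading of the replacement codon
]


def residue_number(codon_start, cds_start=CDS_START):
    """Return the 1-based residue number of the codon starting at codon_start."""
    offset = codon_start - cds_start
    if offset % 3 != 0:
        raise ValueError("codon is not in frame with the CDS start")
    return offset // 3 + 1


def is_silent(old_codon, new_codon):
    """A substitution is silent when both codons encode the same amino acid."""
    return CODON_TO_AA[old_codon] == CODON_TO_AA[new_codon]


for name, start, old, new in MUTATIONS:
    print(name, residue_number(start), CODON_TO_AA[old], is_silent(old, new))
```

The residue numbers derived from the coordinates can be compared with the numbers embedded in the mutant names (A3, P428, A429) as a consistency check on the feature table.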
Figure 18B:

DEFINITION  Plasmid pBLCAT3 (bases 3109 to 6708)
VERSION     X64409.1 GI:58163
SOURCE      synthetic construct.
  ORGANISM  synthetic construct
            artificial sequence.
REFERENCE   1 (bases 3109 to 6708)
  AUTHORS   Luckow,B.H.R.
  TITLE     Direct Submission
  JOURNAL   Submitted (06-FEB-1992) B.H.R. Luckow, German Cancer Res
            Center, Im Neuenheimer Feld 280, W-6900 Heidelberg, FRG
REFERENCE   2 (bases 3109 to 6708)
  AUTHORS   Luckow,B. and Schutz,G.
  TITLE     CAT constructions with multiple unique restriction sites for
            the functional analysis of eukaryotic promoters and regulatory
            elements
  JOURNAL   Nucleic Acids Res. 15 (13), 5490 (1987)
  MEDLINE   87260024
COMMENT     Promoterless CAT vector for transient transfection experiments
            with eukaryotic cells. Allows the analysis of foreign
            promoters and enhancers.
FEATURES             Location/Qualifiers
     source          3109 to 6116
                     /organism="synthetic construct"
                     /db_xref="taxon:32630"
     SV40 t intron   3197..3810
                     /note="SV40 signals"
     polyA_signal    3807..4047
                     /note="SV40 signals"
     CDS             complement(5244..6104)
                     /codon_start=1
                     /transl_table=11
                     /gene="Amp"
                     /product="beta-lactamase"
                     /protein_id="CAA45753.1"
                     /db_xref="GI:58165"


BASE COUNT  1916 a 1479 c 1515 g 1798 t
ORIGIN
1 GGATCCCCTT TGCTATGTAG TTTTTAATGG AAATTACAAC CCATAGTGTG TTGATAAATA
61 GAGAGTCCTG TTTGGTTTAA GCAACCTCTG TTTCTCATAA ACTCCATAAA AACAGGAATA
121 CTCTTTGTTT CTAGCATAAC CAAAAGATTT AGTGAATTGA AAACAATGTT CCCTTAGAGT
181 ATAGGTCTAA TAACCCCGAA AATATTACCA TGATACTGAG CATTTGTAAG TATCTCATAG
241 CATGTAGTAT CCATAGTCCA TCAATGAGAG AGACATTTAA CATGATTTTC ATTAATCAGG
301 TGGAAAAGAC ATGACAACAT TCACAGGCAC TGCACAGAAC ATAGTGGTCC ACCTTGCACA
361 TATTTCACTA AACTAGGTTT ATCTATTTTG TTGCTTTCTC TAACATCTCT GCAATGAAGC
421 AGGTCAACAG TGCCACATAT CCTTTACTTA ACCTAAGGAA CACAAAAAAT TTTCTACATA
481 TATCCTGGTT AGAGAGTGCT TAAAATAAGT TTTCCAAGAA TGGAAAAGAA ATGTTCTGAC
541 TTAACAATTA AGACAGTATT TATTTAAAGC AAGAAATATG AGGCACACAA GAAAATATTT
601 TGGGAAGAAA CCATTTGGTG AACAATATTT CAAATAAAAA TAGACAAACA TAGTTAATTG
661 TAAAACATAT GTTTGACCAG CCCTTCTTTT CAATAGGCTT AATGTGAATA AAATGTTAAA
721 GATTCTCTTT GGGTGGCTGC AAATTGTCCA CGAATAAGAC AAAATATAAA AATAAGGACT
781 GAGTCTCACA AAATGAAAAG GAAATATATT CAGAAAGAGA ATCTTGAGAG AATGTGTTGT
841 CACAAATTAA AGAAAACCTG TGGTGAATGA CATCCTGAGG CCTGAGCTAT TACTGACATT



Figure 18C: 



901 TAAGATAAAG GTAACTGTAT ACATTTGTCC CATTGAGGGG ACAAGAAAGC TGCTCTCATG
961 TTCAGCTCTA TAATTCTTGC CTTAAACAAC TTAAATAGAA TGATTTAAAA TATGGAGCTG
1021 TCCATGGACC TTTGAAATAT AAAATAGTCA AGCAACTTAT CAAGGAATTA CAGATTCCTT
1081 GATACTAACA CAGGTAAATC CCACACGTGT TTTGAGACTA CATTTGCTGG GATTTTATTG
1141 ATGTAATAGG TCACATGTTT TTCGGGCCAA TGTTGCTGTT ATTCGGTTAC TTCAAGAGAA
1201 TAGTGGCAAC TGATGCTATG TATTCTAGGG GTTTGAAGTG ATGTTTCATG ATTGAAATTT
1261 GTAAAAGAAT AACATCATCA TTCTTAACAA TAGAACATAT AAAGTCACAC AGAAGTGACA
1321 GTGTTTAAGC TGTACTATTG ATCAAAGAAA TTTATTACCT TCAGTTTCAA TGGAAATAAT
1381 TACTGATAAT ACAAACATGT GTGAACACAC ACTAATCCTA TCCAAATGCA CAGTGATACA
1441 CAGAAAATAT TAGCAAGTAG AATGCAATAT TTATATAACG ATTGTATTTA TCAATCAATT
1501 GTATGTATCA ATATATGGGC TATTTTCTTA CACATGATTT TATTCAAATT TACTCTAATC
1561 ATTGTTGAAC CATTTAGAAA AGGCATACTG GCAACTTTTC CTTACCTCAT CCAGCTGGGC
1621 AAAAGTCCCA GTGTGGAGTA AAGGATGCAA GATTTCCTGC TCTGTTAAGT ATAAAATAAT
1681 AGTATGAATT CAAAGGTGCC ATTCTTCTGC TTCTAGTTAT AAAGGCAGTG CTTGCTTCTT
1741 CCAGCACAGA TCTGGATCTC GAGGAGCTTG GCGAGATTTT CAGGAGCTAA GGAAGCTAAA
1801 AGCCGCCACC ATGAAAGCCA TCTTAATCCC ATTTTTATCT CTTCTGATTC CGTTAACCCC
1861 GCAATCTGCA TTCGCTCAGA GTGAGCCGGA GCTGAAGCTG GAAAGTGTGG TGATTGTCAG
1921 TCGTCATGGT GTGCGTGCTC CAACCAAGGC CACGCAACTG ATGCAGGATG TCACCCCAGA
1981 CGCATGGCCA ACCTGGCCGG TAAAACTGGG TTGGCTGACA CCGCGCGGTG GTGAGCTAAT
2041 CGCCTATCTC GGACATTACC AACGCCAGCG TCTGGTAGCC GACGGATTGC TGGCGAAAAA
2101 GGGCTGCCCG CAGTCTGGTC AGGTCGCGAT TATTGCTGAT GTCGACGAGC GTACCCGTAA
2161 AACAGGCGAA GCCTTCGCCG CCGGGCTGGC ACCTGACTGT GCAATAACCG TACATACCCA
2221 GGCAGATACG TCCAGTCCCG ATCCGTTATT TAATCCTCTA AAAACTGGCG TTTGCCAACT
2281 GGATAACGCG AACGTGACTG ACGCGATCCT CAGCAGGGCA GGAGGGTCAA TTGCTGACTT
2341 TACCGGGCAT CGGCAAACGG CGTTTCGCGA ACTGGAACGG GTGCTTAATT TTCCGCAATC
2401 AAACTTGTGC CTTAAACGTG AGAAACAGGA CGAAAGCTGT TCATTAACGC AGGCATTACC
2461 ATCGGAACTC AAGGTGAGCG CCGACAATGT CTCATTAACC GGTGCGGTAA GCCTCGCATC
2521 AATGCTGACG GAGATATTTC TCCTGCAACA AGCACAGGGA ATGCCGGAGC CGGGGTGGGG
2581 AAGGATCACC GATTCACACC AGTGGAACAC CTTGCTAAGT TTGCATAACG CGCAATTTTA
2641 TTTGCTACAA CGCACGCCAG AGGTTGCCCG CAGCCGCGCC ACCCCGTTAT TAGATTTGAT
2701 CAAGACAGCG TTGACGCCCC ATCCACCGCA AAAACAGGCG TATGGTGTGA CATTACCCAC
2761 TTCAGTGCTG TTTATCGCCG GACACGATAC TAATCTGGCA AATCTCGGCG GCGCACTGGA
2821 GCTCAACTGG ACGCTTCCCG GTGAGCCGGA TAACACGCCG CCAGGTGGTG AACTGGTGTT
2881 TGAACGCTGG CGTCGGCTAA GCGATAACAG CCAGTGGATT CAGGTTTCGC TGGTCTTCCA
2941 GACTTTACAG CAGATGCGTG ATAAAACGCC GCTGTCATTA AATACGCCGC CCGGAGAGGT
3001 GAAACTGACC CTGGCAGGAT GTGAAGAGCG AAATGCGCAG GGCATGTGTT CGTTGGCAGG
3061 TTTTACGCAA ATCGTGAATG AAGCACGCAT ACCCGCTTGC AGTTTGTAAG GTATAAGGCA
3121 GTTATTGGTG CCCTTAAACG CCTGGTGCTA CGCCTGAATA AGTGATAATA AGCGGATGAA
3181 TGGCAGAAAT TCGCCGGATC TTTGTGAAGG AACCTTACTT CTGTGGTGTG ACATAATTGG
3241 ACAAACTACC TACAGAGATT TAAAGCTCTA AGGTAAATAT AAAATTTTTA AGTGTATAAT
3301 GTGTTAAACT ACTGATTCTA ATTGTTTGTG TATTTTAGAT TCCAACCTAT GGAACTGATG
3361 AATGGGAGCA GTGGTGGAAT GCCTTTAATG AGGAAAACCT GTTTTGCTCA GAAGAAATGC
3421 CATCTAGTGA TGATGAGGCT ACTGCTGACT CTCAACATTC TACTCCTCCA AAAAAGAAGA
3481 GAAAGGTAGA AGACCCCAAG GACTTTCCTT CAGAATTGCT AAGTTTTTTG AGTCATGCTG
3541 TGTTTAGTAA TAGAACTCTT GCTTGCTTTG CTATTTACAC CACAAAGGAA AAAGCTGCAC
3601 TGCTATACAA GAAAATTATG GAAAAATATT CTGTAACCTT TATAAGTAGG CATAACAGTT
3661 ATAATCATAA CATACTGTTT TTTCTTACTC CACACAGGCA TAGAGTGTCT GCTATTAATA
3721 ACTATGCTCA AAAATTGTGT ACCTTTAGCT TTTTAATTTG TAAAGGGGTT AATAAGGAAT
3781 ATTTGATGTA TAGTGCCTTG ACTAGAGATC ATAATCAGCC ATACCACATT TGTAGAGGTT
3841 TTACTTGCTT TAAAAAACCT CCCACACCTC CCCCTGAACC TGAAACATAA AATGAATGCA
3901 ATTGTTGTTG TTAACTTGTT TATTGCAGCT TATAATGGTT ACAAATAAAG CAATAGCATC
3961 ACAAATTTCA CAAATAAAGC ATTTTTTTCA CTGCATTCTA GTTGTGGTTT GTCCAAACTC
4021 ATCAATGTAT CTTATCATGT CTGGATCGAT CCCCGGGTAC CGAGCTCGAA TTCGTAATCA
4081 TGGTCATAGC TGTTTCCTGT GTGAAATTGT TATCCGCTCA CAATTCCACA CAACATACGA
4141 GCCGGAAGCA TAAAGTGTAA AGCCTGGGGT GCCTAATGAG TGAGCTAACT CACATTAATT
4201 GCGTTGCGCT CACTGCCCGC TTTCCAGTCG GGAAACCTGT CGTGCCAGCT GCATTAATGA
4261 ATCGGCCAAC GCGCGGGGAG AGGCGGTTTG CGTATTGGGC GCTCTTCCGC TTCCTCGCTC
4321 ACTGACTCGC TGCGCTCGGT CGTTCGGCTG CGGCGAGCGG TATCAGCTCA CTCAAAGGCG
Figure 18D:

4381 GTAATACGGT TATCCACAGA ATCAGGGGAT AACGCAGGAA AGAACATGTG AGCAAAAGGC
4441 CAGCAAAAGG CCAGGAACCG TAAAAAGGCC GCGTTGCTGG CGTTTTTCCA TAGGCTCCGC
4501 CCCCCTGACG AGCATCACAA AAATCGACGC TCAAGTCAGA GGTGGCGAAA CCCGACAGGA
4561 CTATAAAGAT ACCAGGCGTT TCCCCCTGGA AGCTCCCTCG TGCGCTCTCC TGTTCCGACC
4621 CTGCCGCTTA CCGGATACCT GTCCGCCTTT CTCCCTTCGG GAAGCGTGGC GCTTTCTCAA
4681 TGCTCACGCT GTAGGTATCT CAGTTCGGTG TAGGTCGTTC GCTCCAAGCT GGGCTGTGTG
4741 CACGAACCCC CCGTTCAGCC CGACCGCTGC GCCTTATCCG GTAACTATCG TCTTGAGTCC
4801 AACCCGGTAA GACACGACTT ATCGCCACTG GCAGCAGCCA CTGGTAACAG GATTAGCAGA
4861 GCGAGGTATG TAGGCGGTGC TACAGAGTTC TTGAAGTGGT GGCCTAACTA CGGCTACACT
4921 AGAAGGACAG TATTTGGTAT CTGCGCTCTG CTGAAGCCAG TTACCTTCGG AAAAAGAGTT
4981 GGTAGCTCTT GATCCGGCAA ACAAACCACC GCTGGTAGCG GTGGTTTTTT TGTTTGCAAG
5041 CAGCAGATTA CGCGCAGAAA AAAAGGATCT CAAGAAGATC CTTTGATCTT TTCTACGGGG
5101 TCTGACGCTC AGTGGAACGA AAACTCACGT TAAGGGATTT TGGTCATGAG ATTATCAAAA
5161 AGGATCTTCA CCTAGATCCT TTTAAATTAA AAATGAAGTT TTAAATCAAT CTAAAGTATA
5221 TATGAGTAAA CTTGGTCTGA CAGTTACCAA TGCTTAATCA GTGAGGCACC TATCTCAGCG
5281 ATCTGTCTAT TTCGTTCATC CATAGTTGCC TGACTCCCCG TCGTGTAGAT AACTACGATA
5341 CGGGAGGGCT TACCATCTGG CCCCAGTGCT GCAATGATAC CGCGAGACCC ACGCTCACCG
5401 GCTCCAGATT TATCAGCAAT AAACCAGCCA GCCGGAAGGG CCGAGCGCAG AAGTGGTCCT
5461 GCAACTTTAT CCGCCTCCAT CCAGTCTATT AATTGTTGCC GGGAAGCTAG AGTAAGTAGT
5521 TCGCCAGTTA ATAGTTTGCG CAACGTTGTT GCCATTGCTA CAGGCATCGT GGTGTCACGC
5581 TCGTCGTTTG GTATGGCTTC ATTCAGCTCC GGTTCCCAAC GATCAAGGCG AGTTACATGA
5641 TCCCCCATGT TGTGCAAAAA AGCGGTTAGC TCCTTCGGTC CTCCGATCGT TGTCAGAAGT
5701 AAGTTGGCCG CAGTGTTATC ACTCATGGTT ATGGCAGCAC TGCATAATTC TCTTACTGTC
5761 ATGCCATCCG TAAGATGCTT TTCTGTGACT GGTGAGTACT CAACCAAGTC ATTCTGAGAA
5821 TAGTGTATGC GGCGACCGAG TTGCTCTTGC CCGGCGTCAA TACGGGATAA TACCGCGCCA
5881 CATAGCAGAA CTTTAAAAGT GCTCATCATT GGAAAACGTT CTTCGGGGCG AAAACTCTCA
5941 AGGATCTTAC CGCTGTTGAG ATCCAGTTCG ATGTAACCCA CTCGTGCACC CAACTGATCT
6001 TCAGCATCTT TTACTTTCAC CAGCGTTTCT GGGTGAGCAA AAACAGGAAG GCAAAATGCC
6061 GCAAAAAAGG GAATAAGGGC GACACGGAAA TGTTGAATAC TCATACTCTT CCTTTTTCAA
6121 TATTATTGAA GCATTTATCA GGGTTATTGT CTCATGAGCG GATACATATT TGAATGTATT
6181 TAGAAAAATA AACAAATAGG GGTTCCGCGC ACATTTCCCC GAAAAGTGCC ACCTGACGTC
6241 TAAGAAACCA TTATTATCAT GACATTAACC TATAAAAATA GGCGTATCAC GAGGCCCTTT
6301 CGTCTCGCGC GTTTCGGTGA TGACGGTGAA AACCTCTGAC ACATGCAGCT CCCGGAGACG
6361 GTCACAGCTT GTCTGTAAGC GGATGCCGGG AGCAGACAAG CCCGTCAGGG CGCGTCAGCG
6421 GGTGTTGGCG GGTGTCGGGG CTGGCTTAAC TATGCGGCAT CAGAGCAGAT TGTACTGAGA
6481 GTGCACCATA TGCGGTGTGA AATACCGCAC AGATGCGTAA GGAGAAAATA CCGCATCAGG
6541 CGCCATTCGC CATTCAGGCT GCGCAACTGT TGGGAAGGGC GATCGGTGCG GGCCTCTTCG
6601 CTATTACGCC AGCTGGCGAA AGGGGGATGT GCTGCAAGGC GATTAAGTTG GGTAACGCCA
6661 GGGTTTTCCC AGTCACGACG TTGTAAAACG ACGGCCAGTG CCAAGCTT



// 
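
As a consistency check on a record such as the one ending above, the declared BASE COUNT (1916 a, 1479 c, 1515 g, 1798 t) can be compared with the declared LOCUS length (6708 bp), and the bases in the ORIGIN rows can be tallied the same way. The sketch below is illustrative only and not part of the listing; the helper names `count_bases` and `matches_declared_length` are hypothetical, and the sample row uses just the first two ten-base groups of the sequence.

```python
from collections import Counter


def count_bases(origin_text):
    """Tally a/c/g/t in ORIGIN-style text, ignoring position numbers and spaces."""
    return Counter(ch for ch in origin_text.lower() if ch in "acgt")


def matches_declared_length(base_count, locus_length):
    """True when the per-base counts add up to the declared sequence length."""
    return sum(base_count.values()) == locus_length


# Values declared in the BASE COUNT and LOCUS lines of the record.
declared = {"a": 1916, "c": 1479, "g": 1515, "t": 1798}
print(matches_declared_length(declared, 6708))

# Tallying a fragment of an ORIGIN row works the same way.
sample_row = "1 GGATCCCCTT TGCTATGTAG"
print(dict(count_bases(sample_row)))
```

Running the tally over every ORIGIN row and comparing it with the declared counts would flag rows in which a base was lost or misread.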
Figure 19: Nucleic acid sequence of the known segment of the R15/appa+intron transgene
used for the generation of transgenic mice (SEQ ID NO: 3).



LOCUS       R15/appa   4060 bp   DNA   SYN   15-APR-2000
DEFINITION  R15/appa transgene without vector
ACCESSION   R15/appa
KEYWORDS    salivary proline-rich protein; acid glucose-1-phosphatase; appA
            gene; periplasmic phosphoanhydride phosphohydrolase; artificial
            sequence;
SOURCE      synthetic construct.
  ORGANISM  synthetic construct
            artificial sequence.
REFERENCE   1 (bases 1 to 4060)
  AUTHORS   Golovan, S., Forsberg, C.W., Phillips, J.
  JOURNAL   Unpublished.



DEFINITION  Rat salivary proline-rich protein (RP15) gene.
ACCESSION   M64793 M36414
VERSION     M64793.1 GI:206711
SOURCE      Rat (Sprague-Dawley) liver DNA.
  ORGANISM  Rattus norvegicus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia;
            Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
REFERENCE   1 (bases 1 to 1748)
  AUTHORS   Lin,H.H. and Ann,D.K.
  TITLE     Molecular characterization of rat multigene family encoding
            proline-rich proteins
  JOURNAL   Genomics 10, 102-113 (1991)
  MEDLINE   91257817
FEATURES             Location/Qualifiers
     source          1..1748
                     /organism="Rattus norvegicus"
                     /strain="Sprague-Dawley"
                     /db_xref="taxon:10116"
                     /tissue_type="liver"
                     /tissue_lib="cosmid genomic library"
     misc_feature    1802-1810
                     /function="consensus sequence for initiation in
                     higher eukaryotes"



FEATURES Location/Qualifiers 

DEFINITION  E. coli periplasmic phosphoanhydride phosphohydrolase (appA)
            gene.
ACCESSION   M58708 L03370 L03371 L03372 L03373 L03374 L03375
VERSION     M58708.1 GI:145283
SOURCE      Escherichia coli DNA.
  ORGANISM  Escherichia coli
            Bacteria; Proteobacteria; gamma subdivision;
            Enterobacteriaceae; Escherichia.
REFERENCE   1 (bases 1811..3109)
  AUTHORS   Dassa,J., Marck,C. and Boquet,P.L.
Figure 19A:

  TITLE     The complete nucleotide sequence of the Escherichia coli
            gene appA reveals significant homology between pH 2.5
            acid phosphatase and glucose-1-phosphatase
  JOURNAL   J. Bacteriol. 172 (9), 5497-5500 (1990)
  MEDLINE   90368616
FEATURES             Location/Qualifiers
     source          1811..3109
                     /organism="Escherichia coli"
                     /db_xref="taxon:562"
     sig_peptide     1811..1876
                     /gene="appA"
     CDS             1811..3109
                     /gene="appA"
                     /standard_name="acid phosphatase/phytase"
                     /transl_table=11
                     /product="periplasmic phosphoanhydride
                     phosphohydrolase"
                     /protein_id="AAA72086.1"
                     /db_xref="GI:145285"
     mat_peptide     1877..3106
                     /gene="appA"
                     /product="periplasmic phosphoanhydride
                     phosphohydrolase"
     mutation        replace(1817..1819, "gcg changed to gcc")
                     /gene="appA"
                     /standard_name="A3 mutant"
                     /note="created by site directed mutagenesis"
                     /phenotype="silent mutation"
     mutation        replace(3092..3094, "ccg changed to ccc")
                     /gene="appA"
                     /standard_name="P428 mutant"
                     /note="created by site directed mutagenesis"
                     /phenotype="silent mutation"
     mutation        replace(3095..3097, "gcg changed to gct")
                     /gene="appA"
                     /standard_name="A429 mutant"
                     /note="created by site directed mutagenesis"
                     /phenotype="silent mutation"
Figure 19B:

     SV40 t intron   3197..3810
                     /note="SV40 signals"
     polyA_signal    3807..4047
                     /note="SV40 signals"



BASE COUNT 1257 a 814 c 843 g 1146 t 

ORIGIN 

1 GGATCCCCTT TGCTATGTAG TTTTTAATGG AAATTACAAC CCATAGTGTG TTGATAAATA
61 GAGAGTCCTG TTTGGTTTAA GCAACCTCTG TTTCTCATAA ACTCCATAAA AACAGGAATA
121 CTCTTTGTTT CTAGCATAAC CAAAAGATTT AGTGAATTGA AAACAATGTT CCCTTAGAGT
181 ATAGGTCTAA TAACCCCGAA AATATTACCA TGATACTGAG CATTTGTAAG TATCTCATAG
241 CATGTAGTAT CCATAGTCCA TCAATGAGAG AGACATTTAA CATGATTTTC ATTAATCAGG
301 TGGAAAAGAC ATGACAACAT TCACAGGCAC TGCACAGAAC ATAGTGGTCC ACCTTGCACA
361 TATTTCACTA AACTAGGTTT ATCTATTTTG TTGCTTTCTC TAACATCTCT GCAATGAAGC
421 AGGTCAACAG TGCCACATAT CCTTTACTTA ACCTAAGGAA CACAAAAAAT TTTCTACATA
481 TATCCTGGTT AGAGAGTGCT TAAAATAAGT TTTCCAAGAA TGGAAAAGAA ATGTTCTGAC
541 TTAACAATTA AGACAGTATT TATTTAAAGC AAGAAATATG AGGCACACAA GAAAATATTT
601 TGGGAAGAAA CCATTTGGTG AACAATATTT CAAATAAAAA TAGACAAACA TAGTTAATTG
661 TAAAACATAT GTTTGACCAG CCCTTCTTTT CAATAGGCTT AATGTGAATA AAATGTTAAA
721 GATTCTCTTT GGGTGGCTGC AAATTGTCCA CGAATAAGAC AAAATATAAA AATAAGGACT
781 GAGTCTCACA AAATGAAAAG GAAATATATT CAGAAAGAGA ATCTTGAGAG AATGTGTTGT
841 CACAAATTAA AGAAAACCTG TGGTGAATGA CATCCTGAGG CCTGAGCTAT TACTGACATT
901 TAAGATAAAG GTAACTGTAT ACATTTGTCC CATTGAGGGG ACAAGAAAGC TGCTCTCATG
961 TTCAGCTCTA TAATTCTTGC CTTAAACAAC TTAAATAGAA TGATTTAAAA TATGGAGCTG
1021 TCCATGGACC TTTGAAATAT AAAATAGTCA AGCAACTTAT CAAGGAATTA CAGATTCCTT
1081 GATACTAACA CAGGTAAATC CCACACGTGT TTTGAGACTA CATTTGCTGG GATTTTATTG
1141 ATGTAATAGG TCACATGTTT TTCGGGCCAA TGTTGCTGTT ATTCGGTTAC TTCAAGAGAA
1201 TAGTGGCAAC TGATGCTATG TATTCTAGGG GTTTGAAGTG ATGTTTCATG ATTGAAATTT
1261 GTAAAAGAAT AACATCATCA TTCTTAACAA TAGAACATAT AAAGTCACAC AGAAGTGACA
1321 GTGTTTAAGC TGTACTATTG ATCAAAGAAA TTTATTACCT TCAGTTTCAA TGGAAATAAT
1381 TACTGATAAT ACAAACATGT GTGAACACAC ACTAATCCTA TCCAAATGCA CAGTGATACA
1441 CAGAAAATAT TAGCAAGTAG AATGCAATAT TTATATAACG ATTGTATTTA TCAATCAATT
1501 GTATGTATCA ATATATGGGC TATTTTCTTA CACATGATTT TATTCAAATT TACTCTAATC
1561 ATTGTTGAAC CATTTAGAAA AGGCATACTG GCAACTTTTC CTTACCTCAT CCAGCTGGGC
1621 AAAAGTCCCA GTGTGGAGTA AAGGATGCAA GATTTCCTGC TCTGTTAAGT ATAAAATAAT
1681 AGTATGAATT CAAAGGTGCC ATTCTTCTGC TTCTAGTTAT AAAGGCAGTG CTTGCTTCTT
1741 CCAGCACAGA TCTGGATCTC GAGGAGCTTG GCGAGATTTT CAGGAGCTAA GGAAGCTAAA
1801 AGCCGCCACC ATGAAAGCCA TCTTAATCCC ATTTTTATCT CTTCTGATTC CGTTAACCCC
1861 GCAATCTGCA TTCGCTCAGA GTGAGCCGGA GCTGAAGCTG GAAAGTGTGG TGATTGTCAG
1921 TCGTCATGGT GTGCGTGCTC CAACCAAGGC CACGCAACTG ATGCAGGATG TCACCCCAGA
1981 CGCATGGCCA ACCTGGCCGG TAAAACTGGG TTGGCTGACA CCGCGCGGTG GTGAGCTAAT
2041 CGCCTATCTC GGACATTACC AACGCCAGCG TCTGGTAGCC GACGGATTGC TGGCGAAAAA
2101 GGGCTGCCCG CAGTCTGGTC AGGTCGCGAT TATTGCTGAT GTCGACGAGC GTACCCGTAA
2161 AACAGGCGAA GCCTTCGCCG CCGGGCTGGC ACCTGACTGT GCAATAACCG TACATACCCA
2221 GGCAGATACG TCCAGTCCCG ATCCGTTATT TAATCCTCTA AAAACTGGCG TTTGCCAACT
2281 GGATAACGCG AACGTGACTG ACGCGATCCT CAGCAGGGCA GGAGGGTCAA TTGCTGACTT
2341 TACCGGGCAT CGGCAAACGG CGTTTCGCGA ACTGGAACGG GTGCTTAATT TTCCGCAATC
2401 AAACTTGTGC CTTAAACGTG AGAAACAGGA CGAAAGCTGT TCATTAACGC AGGCATTACC
2461 ATCGGAACTC AAGGTGAGCG CCGACAATGT CTCATTAACC GGTGCGGTAA GCCTCGCATC
2521 AATGCTGACG GAGATATTTC TCCTGCAACA AGCACAGGGA ATGCCGGAGC CGGGGTGGGG
2581 AAGGATCACC GATTCACACC AGTGGAACAC CTTGCTAAGT TTGCATAACG CGCAATTTTA
2641 TTTGCTACAA CGCACGCCAG AGGTTGCCCG CAGCCGCGCC ACCCCGTTAT TAGATTTGAT
2701 CAAGACAGCG TTGACGCCCC ATCCACCGCA AAAACAGGCG TATGGTGTGA CATTACCCAC
2761 TTCAGTGCTG TTTATCGCCG GACACGATAC TAATCTGGCA AATCTCGGCG GCGCACTGGA
2821 GCTCAACTGG ACGCTTCCCG GTGAGCCGGA TAACACGCCG CCAGGTGGTG AACTGGTGTT
2881 TGAACGCTGG CGTCGGCTAA GCGATAACAG CCAGTGGATT CAGGTTTCGC TGGTCTTCCA
Figure 19C:

2941 GACTTTACAG CAGATGCGTG ATAAAACGCC GCTGTCATTA AATACGCCGC CCGGAGAGGT
3001 GAAACTGACC CTGGCAGGAT GTGAAGAGCG AAATGCGCAG GGCATGTGTT CGTTGGCAGG
3061 TTTTACGCAA ATCGTGAATG AAGCACGCAT ACCCGCTTGC AGTTTGTAAG GTATAAGGCA
3121 GTTATTGGTG CCCTTAAACG CCTGGTGCTA CGCCTGAATA AGTGATAATA AGCGGATGAA
3181 TGGCAGAAAT TCGCCGGATC TTTGTGAAGG AACCTTACTT CTGTGGTGTG ACATAATTGG
3241 ACAAACTACC TACAGAGATT TAAAGCTCTA AGGTAAATAT AAAATTTTTA AGTGTATAAT
3301 GTGTTAAACT ACTGATTCTA ATTGTTTGTG TATTTTAGAT TCCAACCTAT GGAACTGATG
3361 AATGGGAGCA GTGGTGGAAT GCCTTTAATG AGGAAAACCT GTTTTGCTCA GAAGAAATGC
3421 CATCTAGTGA TGATGAGGCT ACTGCTGACT CTCAACATTC TACTCCTCCA AAAAAGAAGA
3481 GAAAGGTAGA AGACCCCAAG GACTTTCCTT CAGAATTGCT AAGTTTTTTG AGTCATGCTG
3541 TGTTTAGTAA TAGAACTCTT GCTTGCTTTG CTATTTACAC CACAAAGGAA AAAGCTGCAC
3601 TGCTATACAA GAAAATTATG GAAAAATATT CTGTAACCTT TATAAGTAGG CATAACAGTT
3661 ATAATCATAA CATACTGTTT TTTCTTACTC CACACAGGCA TAGAGTGTCT GCTATTAATA
3721 ACTATGCTCA AAAATTGTGT ACCTTTAGCT TTTTAATTTG TAAAGGGGTT AATAAGGAAT
3781 ATTTGATGTA TAGTGCCTTG ACTAGAGATC ATAATCAGCC ATACCACATT TGTAGAGGTT
3841 TTACTTGCTT TAAAAAACCT CCCACACCTC CCCCTGAACC TGAAACATAA AATGAATGCA
3901 ATTGTTGTTG TTAACTTGTT TATTGCAGCT TATAATGGTT ACAAATAAAG CAATAGCATC
3961 ACAAATTTCA CAAATAAAGC ATTTTTTTCA CTGCATTCTA GTTGTGGTTT GTCCAAACTC
4021 ATCAATGTAT CTTATCATGT CTGGATCGAT CCCCGGGTAC
Figure 20: Nucleic acid sequence of the known segment of the R15/appa plasmid including
the vector sequences of pBLCAT3 (SEQ ID NO: 4)



LOCUS       R15/appa   6116 bp   DNA   SYN   15-APR-2000
DEFINITION  R15/appa transgene with vector
ACCESSION   R15/appa
KEYWORDS    salivary proline-rich protein; acid glucose-1-phosphatase; appA
            gene; periplasmic phosphoanhydride phosphohydrolase; artificial
            sequence;
SOURCE      synthetic construct.
  ORGANISM  synthetic construct
            artificial sequence.
REFERENCE   1 (bases 1 to 6116)
  AUTHORS   Golovan, S., Forsberg, C.W., Phillips, J.
  JOURNAL   Unpublished.



DEFINITION  Rat salivary proline-rich protein (RP15) gene.
ACCESSION   M64793 M36414
VERSION     M64793.1 GI:206711
SOURCE      Rat (Sprague-Dawley) liver DNA.
  ORGANISM  Rattus norvegicus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia;
            Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
REFERENCE   1 (bases 1 to 1748)
  AUTHORS   Lin,H.H. and Ann,D.K.
  TITLE     Molecular characterization of rat multigene family encoding
            proline-rich proteins
  JOURNAL   Genomics 10, 102-113 (1991)
  MEDLINE   91257817
FEATURES             Location/Qualifiers
     source          1..1748
                     /organism="Rattus norvegicus"
                     /strain="Sprague-Dawley"
                     /db_xref="taxon:10116"
                     /tissue_type="liver"
                     /tissue_lib="cosmid genomic library"
     misc_feature    1802-1810
                     /function="consensus sequence for initiation in
                     higher eukaryotes"



FEATURES             Location/Qualifiers

DEFINITION  E. coli periplasmic phosphoanhydride phosphohydrolase (appA)
            gene.
ACCESSION   M58708 L03370 L03371 L03372 L03373 L03374 L03375
VERSION     M58708.1 GI:145283
SOURCE      Escherichia coli DNA.
  ORGANISM  Escherichia coli
            Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae;
            Escherichia.
REFERENCE   1 (bases 1811..3109)
  AUTHORS   Dassa,J., Marck,C. and Boquet,P.L.
  TITLE     The complete nucleotide sequence of the Escherichia coli gene appA
            reveals significant homology between pH 2.5 acid phosphatase
            and glucose-1-phosphatase
  JOURNAL   J. Bacteriol. 172 (9), 5497-5500 (1990)
Figure 20A:

  MEDLINE   90368616
FEATURES             Location/Qualifiers
     source          1811..3109
                     /organism="Escherichia coli"
                     /db_xref="taxon:562"
     sig_peptide     1811..1876
                     /gene="appA"
     CDS             1811..3109
                     /gene="appA"
                     /standard_name="acid phosphatase/phytase"
                     /transl_table=11
                     /product="periplasmic phosphoanhydride phosphohydrolase"
                     /protein_id="AAA72086.1"
                     /db_xref="GI:145285"
     mat_peptide     1877..3106
                     /gene="appA"
                     /product="periplasmic phosphoanhydride phosphohydrolase"
     mutation        replace(1817..1819, "gcg changed to gcc")
                     /gene="appA"
                     /standard_name="A3 mutant"
                     /note="created by site directed mutagenesis"
                     /phenotype="silent mutation"
     mutation        replace(3092..3094, "ccg changed to ccc")
                     /gene="appA"
                     /standard_name="P428 mutant"
                     /note="created by site directed mutagenesis"
                     /phenotype="silent mutation"
     mutation        replace(3095..3097, "gcg changed to gct")
                     /gene="appA"
                     /standard_name="A429 mutant"
                     /note="created by site directed mutagenesis"
                     /phenotype="silent mutation"



DEFINITION  Plasmid pBLCAT3 (bases 3109 to 6116)
ACCESSION   X64409
VERSION     X64409.1 GI:58163
SOURCE      synthetic construct.
  ORGANISM  synthetic construct
            artificial sequence.
REFERENCE   1 (bases 3109 to 6116)
  AUTHORS   Luckow,B.H.R.
  TITLE     Direct Submission
  JOURNAL   Submitted (06-FEB-1992) B.H.R. Luckow, German Cancer Res
            Center, Im Neuenheimer Feld 280, W-6900 Heidelberg, FRG
Figure 20B:

REFERENCE   2 (bases 3109 to 6116)
  AUTHORS   Luckow,B. and Schutz,G.
  TITLE     CAT constructions with multiple unique restriction sites for
            the functional analysis of eukaryotic promoters and regulatory
            elements
  JOURNAL   Nucleic Acids Res. 15 (13), 5490 (1987)
  MEDLINE   87260024
COMMENT     Promoterless CAT vector for transient transfection experiments
            with eukaryotic cells. Allows the analysis of foreign
            promoters and enhancers.
FEATURES             Location/Qualifiers
     source          3109 to 6116
                     /organism="synthetic construct"
                     /db_xref="taxon:32630"
     polyA_signal    3262..3457
                     /note="SV40 signals"



     CDS             complement(4654..5514)
                     /codon_start=1
                     /transl_table=11
                     /gene="Amp"
                     /product="beta-lactamase"
                     /protein_id="CAA45753.1"
                     /db_xref="GI:58165"
BASE COUNT  1724 a 1386 c 1407 g 1599 t
ORIGIN
1 GGATCCCCTT TGCTATGTAG TTTTTAATGG AAATTACAAC CCATAGTGTG TTGATAAATA
61 GAGAGTCCTG TTTGGTTTAA GCAACCTCTG TTTCTCATAA ACTCCATAAA AACAGGAATA
121 CTCTTTGTTT CTAGCATAAC CAAAAGATTT AGTGAATTGA AAACAATGTT CCCTTAGAGT
181 ATAGGTCTAA TAACCCCGAA AATATTACCA TGATACTGAG CATTTGTAAG TATCTCATAG
241 CATGTAGTAT CCATAGTCCA TCAATGAGAG AGACATTTAA CATGATTTTC ATTAATCAGG
301 TGGAAAAGAC ATGACAACAT TCACAGGCAC TGCACAGAAC ATAGTGGTCC ACCTTGCACA
361 TATTTCACTA AACTAGGTTT ATCTATTTTG TTGCTTTCTC TAACATCTCT GCAATGAAGC
421 AGGTCAACAG TGCCACATAT CCTTTACTTA ACCTAAGGAA CACAAAAAAT TTTCTACATA
481 TATCCTGGTT AGAGAGTGCT TAAAATAAGT TTTCCAAGAA TGGAAAAGAA ATGTTCTGAC
541 TTAACAATTA AGACAGTATT TATTTAAAGC AAGAAATATG AGGCACACAA GAAAATATTT
601 TGGGAAGAAA CCATTTGGTG AACAATATTT CAAATAAAAA TAGACAAACA TAGTTAATTG
661 TAAAACATAT GTTTGACCAG CCCTTCTTTT CAATAGGCTT AATGTGAATA AAATGTTAAA
721 GATTCTCTTT GGGTGGCTGC AAATTGTCCA CGAATAAGAC AAAATATAAA AATAAGGACT
781 GAGTCTCACA AAATGAAAAG GAAATATATT CAGAAAGAGA ATCTTGAGAG AATGTGTTGT
841 CACAAATTAA AGAAAACCTG TGGTGAATGA CATCCTGAGG CCTGAGCTAT TACTGACATT
901 TAAGATAAAG GTAACTGTAT ACATTTGTCC CATTGAGGGG ACAAGAAAGC TGCTCTCATG
961 TTCAGCTCTA TAATTCTTGC CTTAAACAAC TTAAATAGAA TGATTTAAAA TATGGAGCTG
1021 TCCATGGACC TTTGAAATAT AAAATAGTCA AGCAACTTAT CAAGGAATTA CAGATTCCTT
1081 GATACTAACA CAGGTAAATC CCACACGTGT TTTGAGACTA CATTTGCTGG GATTTTATTG
1141 ATGTAATAGG TCACATGTTT TTCGGGCCAA TGTTGCTGTT ATTCGGTTAC TTCAAGAGAA
1201 TAGTGGCAAC TGATGCTATG TATTCTAGGG GTTTGAAGTG ATGTTTCATG ATTGAAATTT
1261 GTAAAAGAAT AACATCATCA TTCTTAACAA TAGAACATAT AAAGTCACAC AGAAGTGACA
1321 GTGTTTAAGC TGTACTATTG ATCAAAGAAA TTTATTACCT TCAGTTTCAA TGGAAATAAT
1381 TACTGATAAT ACAAACATGT GTGAACACAC ACTAATCCTA TCCAAATGCA CAGTGATACA
1441 CAGAAAATAT TAGCAAGTAG AATGCAATAT TTATATAACG ATTGTATTTA TCAATCAATT
1501 GTATGTATCA ATATATGGGC TATTTTCTTA CACATGATTT TATTCAAATT TACTCTAATC
1561 ATTGTTGAAC CATTTAGAAA AGGCATACTG GCAACTTTTC CTTACCTCAT CCAGCTGGGC
1621 AAAAGTCCCA GTGTGGAGTA AAGGATGCAA GATTTCCTGC TCTGTTAAGT ATAAAATAAT



Figure 20C: 



1681 AGTATGAATT CAAAGGTGCC ATTCTTCTGC TTCTAGTTAT AAAGGCAGTG CTTGCTTCTT
1741 CCAGCACAGA TCTGGATCTC GAGGAGCTTG GCGAGATTTT CAGGAGCTAA GGAAGCTAAA
1801 AGCCGCCACC ATGAAAGCCA TCTTAATCCC ATTTTTATCT CTTCTGATTC CGTTAACCCC
1861 GCAATCTGCA TTCGCTCAGA GTGAGCCGGA GCTGAAGCTG GAAAGTGTGG TGATTGTCAG
1921 TCGTCATGGT GTGCGTGCTC CAACCAAGGC CACGCAACTG ATGCAGGATG TCACCCCAGA
1981 CGCATGGCCA ACCTGGCCGG TAAAACTGGG TTGGCTGACA CCGCGCGGTG GTGAGCTAAT
2041 CGCCTATCTC GGACATTACC AACGCCAGCG TCTGGTAGCC GACGGATTGC TGGCGAAAAA
2101 GGGCTGCCCG CAGTCTGGTC AGGTCGCGAT TATTGCTGAT GTCGACGAGC GTACCCGTAA
2161 AACAGGCGAA GCCTTCGCCG CCGGGCTGGC ACCTGACTGT GCAATAACCG TACATACCCA
2221 GGCAGATACG TCCAGTCCCG ATCCGTTATT TAATCCTCTA AAAACTGGCG TTTGCCAACT
2281 GGATAACGCG AACGTGACTG ACGCGATCCT CAGCAGGGCA GGAGGGTCAA TTGCTGACTT
2341 TACCGGGCAT CGGCAAACGG CGTTTCGCGA ACTGGAACGG GTGCTTAATT TTCCGCAATC
2401 AAACTTGTGC CTTAAACGTG AGAAACAGGA CGAAAGCTGT TCATTAACGC AGGCATTACC
2461 ATCGGAACTC AAGGTGAGCG CCGACAATGT CTCATTAACC GGTGCGGTAA GCCTCGCATC
2521 AATGCTGACG GAGATATTTC TCCTGCAACA AGCACAGGGA ATGCCGGAGC CGGGGTGGGG
2581 AAGGATCACC GATTCACACC AGTGGAACAC CTTGCTAAGT TTGCATAACG CGCAATTTTA
2641 TTTGCTACAA CGCACGCCAG AGGTTGCCCG CAGCCGCGCC ACCCCGTTAT TAGATTTGAT
2701 CAAGACAGCG TTGACGCCCC ATCCACCGCA AAAACAGGCG TATGGTGTGA CATTACCCAC
2761 TTCAGTGCTG TTTATCGCCG GACACGATAC TAATCTGGCA AATCTCGGCG GCGCACTGGA
2821 GCTCAACTGG ACGCTTCCCG GTGAGCCGGA TAACACGCCG CCAGGTGGTG AACTGGTGTT
2881 TGAACGCTGG CGTCGGCTAA GCGATAACAG CCAGTGGATT CAGGTTTCGC TGGTCTTCCA
2941 GACTTTACAG CAGATGCGTG ATAAAACGCC GCTGTCATTA AATACGCCGC CCGGAGAGGT
3001 GAAACTGACC CTGGCAGGAT GTGAAGAGCG AAATGCGCAG GGCATGTGTT CGTTGGCAGG
3061 TTTTACGCAA ATCGTGAATG AAGCACGCAT ACCCGCTTGC AGTTTGTAAG GTATAAGGCA
3121 GTTATTGGTG CCCTTAAACG CCTGGTGCTA CGCCTGAATA AGTGATAATA AGCGGATGAA
3181 TGGCAGAAAT TCGCCGGATC TTTGTGAAGG AACCTTACTT CTGTGGTGTG ACATAATTGG
3241 ACAAACTACC TACAGAGATT TAAAAAACCT CCCACACCTC CCCCTGAACC TGAAACATAA
3301 AATGAATGCA ATTGTTGTTG TTAACTTGTT TATTGCAGCT TATAATGGTT ACAAATAAAG
3361 CAATAGCATC ACAAATTTCA CAAATAAAGC ATTTTTTTCA CTGCATTCTA GTTGTGGTTT
3421 GTCCAAACTC ATCAATGTAT CTTATCATGT CTGGATCGAT CCCCGGGTAC CGAGCTCGAA
3481 TTCGTAATCA TGGTCATAGC TGTTTCCTGT GTGAAATTGT TATCCGCTCA CAATTCCACA
3541 CAACATACGA GCCGGAAGCA TAAAGTGTAA AGCCTGGGGT GCCTAATGAG TGAGCTAACT
3601 CACATTAATT GCGTTGCGCT CACTGCCCGC TTTCCAGTCG GGAAACCTGT CGTGCCAGCT
3661 GCATTAATGA ATCGGCCAAC GCGCGGGGAG AGGCGGTTTG CGTATTGGGC GCTCTTCCGC
3721 TTCCTCGCTC ACTGACTCGC TGCGCTCGGT CGTTCGGCTG CGGCGAGCGG TATCAGCTCA
3781 CTCAAAGGCG GTAATACGGT TATCCACAGA ATCAGGGGAT AACGCAGGAA AGAACATGTG
3841 AGCAAAAGGC CAGCAAAAGG CCAGGAACCG TAAAAAGGCC GCGTTGCTGG CGTTTTTCCA
3901 TAGGCTCCGC CCCCCTGACG AGCATCACAA AAATCGACGC TCAAGTCAGA GGTGGCGAAA
3961 CCCGACAGGA CTATAAAGAT ACCAGGCGTT TCCCCCTGGA AGCTCCCTCG TGCGCTCTCC
4021 TGTTCCGACC CTGCCGCTTA CCGGATACCT GTCCGCCTTT CTCCCTTCGG GAAGCGTGGC
4081 GCTTTCTCAA TGCTCACGCT GTAGGTATCT CAGTTCGGTG TAGGTCGTTC GCTCCAAGCT
4141 GGGCTGTGTG CACGAACCCC CCGTTCAGCC CGACCGCTGC GCCTTATCCG GTAACTATCG
4201 TCTTGAGTCC AACCCGGTAA GACACGACTT ATCGCCACTG GCAGCAGCCA CTGGTAACAG
4261 GATTAGCAGA GCGAGGTATG TAGGCGGTGC TACAGAGTTC TTGAAGTGGT GGCCTAACTA
4321 CGGCTACACT AGAAGGACAG TATTTGGTAT CTGCGCTCTG CTGAAGCCAG TTACCTTCGG
4381 AAAAAGAGTT GGTAGCTCTT GATCCGGCAA ACAAACCACC GCTGGTAGCG GTGGTTTTTT
4441 TGTTTGCAAG CAGCAGATTA CGCGCAGAAA AAAAGGATCT CAAGAAGATC CTTTGATCTT
4501 TTCTACGGGG TCTGACGCTC AGTGGAACGA AAACTCACGT TAAGGGATTT TGGTCATGAG
4561 ATTATCAAAA AGGATCTTCA CCTAGATCCT TTTAAATTAA AAATGAAGTT TTAAATCAAT
4621 CTAAAGTATA TATGAGTAAA CTTGGTCTGA CAGTTACCAA TGCTTAATCA GTGAGGCACC
4681 TATCTCAGCG ATCTGTCTAT TTCGTTCATC CATAGTTGCC TGACTCCCCG TCGTGTAGAT
4741 AACTACGATA CGGGAGGGCT TACCATCTGG CCCCAGTGCT GCAATGATAC CGCGAGACCC
4801 ACGCTCACCG GCTCCAGATT TATCAGCAAT AAACCAGCCA GCCGGAAGGG CCGAGCGCAG
4861 AAGTGGTCCT GCAACTTTAT CCGCCTCCAT CCAGTCTATT AATTGTTGCC GGGAAGCTAG
4921 AGTAAGTAGT TCGCCAGTTA ATAGTTTGCG CAACGTTGTT GCCATTGCTA CAGGCATCGT
4981 GGTGTCACGC TCGTCGTTTG GTATGGCTTC ATTCAGCTCC GGTTCCCAAC GATCAAGGCG
5041 AGTTACATGA TCCCCCATGT TGTGCAAAAA AGCGGTTAGC TCCTTCGGTC CTCCGATCGT
5101 TGTCAGAAGT AAGTTGGCCG CAGTGTTATC ACTCATGGTT ATGGCAGCAC TGCATAATTC
Figure 20D: 



5161 TCTTACTGTC ATGCCATCCG TAAGATGCTT TTCTGTGACT GGTGAGTACT CAACCAAGTC 

5221 ATTCTGAGAA TAGTGTATGC GGCGACCGAG TTGCTCTTGC CCGGCGTCAA TACGGGATAA 

5281 TACCGCGCCA CATAGCAGAA CTTTAAAAGT GCTCATCATT GGAAAACGTT CTTCGGGGCG 
5341 AAAACTCTCA AGGATCTTAC CGCTGTTGAG ATCCAGTTCG ATGTAACCCA CTCGTCCACC 
5401 CAACTGATCT TCAGCATCTT TTACTTTCAC CAGCGTTTCT GGGTGAGCAA AAACAGGAAG 
5461 GCAAAATGCC GCAAAAAAGG GAATAAGGGC GACACGGAAA TGTTGAATAC TCATACTCTT 
5521 CCTTTTTCAA TATTATTGAA GCATTTATCA GGGTTATTGT CTCATGAGCG GATACATATT 
5581 TGAATGTATT TAGAAAAATA AACAAATAGG GGTTCCGCGC ACATTTCCCC GAAAAGTGCC 
5641 ACCTGACGTC TAAGAAACCA TTATTATCAT GACATTAACC TATAAAAATA GGCGTATCAC 
5701 GAGGCCCTTT CGTCTCGCGC GTTTCGGTGA TGACGGTGAA AACCTCTGAC ACATGCAGCT 
5761 CCCGGAGACG GTCACAGCTT GTCTGTAAGC GGATGCCGGG AGCAGACAAG CCCGTCAGGG 
5821 CGCGTCAGCG GGTGTTGGCG GGTGTCGGGG CTGGCTTAAC TATGCGGCAT CAGAGCAGAT 
5881 TGTACTGAGA GTGCACCATA TGCGGTGTGA AATACCGCAC AGATGCGTAA GGAGAAAATA 
5941 CCGCATCAGG CGCCATTCGC CATTCAGGCT GCGCAACTGT TGGGAAGGGC GATCGGTGCG 
6001 GGCCTCTTCG CTATTACGCC AGCTGGCGAA AGGGGGATGT GCTGCAAGGC GATTAAGTTG 
6061 GGTAACGCCA GGGTTTTCCC AGTCACGACG TTGTAAAACG ACGGCCAGTG CCAAGC 



// 
Figure 21: Nucleic acid sequence of the known segment of the R15/appa transgene used for 
the generation of transgenic mice (SEQ ID NO:5) 



LOCUS R15/appa 3470 bp DNA SYN 15-APR-2000 
DEFINITION R15/appa transgene with vector sequences removed. 
ACCESSION R15/appa 
REFERENCE 1 (bases 1 to 3470) 
SOURCE synthetic construct. 
ORGANISM synthetic construct 
artificial sequence. 
KEYWORDS salivary proline-rich protein; acid glucose-1-phosphatase; appA gene; periplasmic phosphoanhydride phosphohydrolase; artificial sequence; 
AUTHORS Golovan, S., Forsberg, C.W., Phillips, J. 
JOURNAL Unpublished. 

DEFINITION Rat salivary proline-rich protein (RP15) gene. 
ACCESSION M64793 M36414 
VERSION M64793.1 GI:206711 
SOURCE Rat (Sprague-Dawley) liver DNA. 
ORGANISM Rattus norvegicus 
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus. 
REFERENCE 1 (bases 1 to 1748) 
AUTHORS Lin,H.H. and Ann,D.K. 
TITLE Molecular characterization of rat multigene family encoding proline-rich proteins 
JOURNAL Genomics 10, 102-113 (1991) 
MEDLINE 91257817 
FEATURES Location/Qualifiers 
source 1..1748 
/organism="Rattus norvegicus" 
/strain="Sprague-Dawley" 
/db_xref="taxon:10116" 
/tissue_type="liver" 
/tissue_lib="cosmid genomic library" 
misc_feature 1802-1810 
/function="consensus sequence for initiation in higher eukaryotes" 



FEATURES Location/Qualifiers 

DEFINITION E. coli periplasmic phosphoanhydride phosphohydrolase (appA) gene, 
ACCESSION M58708 L03370 L03371 L03372 L03373 L03374 L03375 
VERSION M58708.1 GI:145283 
SOURCE Escherichia coli DNA. 
ORGANISM Escherichia coli 
Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae; Escherichia. 
REFERENCE 1 (bases 1811..3109) 
AUTHORS Dassa,J., Marck,C. and Boquet,P.L. 
TITLE The complete nucleotide sequence of the Escherichia coli gene appA reveals significant homology between pH 2.5 acid phosphatase and glucose-1-phosphatase 
Figure 21A: 



JOURNAL J. Bacteriol. 172 (9), 5497-5500 (1990) 
MEDLINE 90368616 
FEATURES Location/Qualifiers 
Source 1811..3109 
/organism="Escherichia coli" 
/db_xref="taxon:562" 
sig_peptide 1811..1876 
/gene="appA" 
CDS 1811..3109 
/gene="appA" 
/standard_name="acid phosphatase/phytase" 
/transl_table=11 
/product="periplasmic phosphoanhydride phosphohydrolase" 
/protein_id="AAA72086.1" 
/db_xref="GI:145285" 
mat_peptide 1877..3106 
/gene="appA" 
/product="periplasmic phosphoanhydride phosphohydrolase" 
mutation replace(1817..1819,"gcg changed to gcc") 
/gene="appA" 
/standard_name="A3 mutant" 
/note="created by site directed mutagenesis" 
/phenotype="silent mutation" 
mutation replace(3092..3094,"ccg changed to ccc") 
/gene="appA" 
/standard_name="P428 mutant" 
/note="created by site directed mutagenesis" 
/phenotype="silent mutation" 
mutation replace(3095..3097,"gcg changed to gct") 
/gene="appA" 
/standard_name="A429 mutant" 
/note="created by site directed mutagenesis" 
/phenotype="silent mutation" 
polyA_signal 3262..3457 
/note="SV40 signals" 

BASE COUNT 1065 a 721 c 735 g 949 t 
ORIGIN 

1 GGATCCCCTT TGCTATGTAG TTTTTAATGG AAATTACT^C CCATAGTGTG TTGATAAATA 

61 GAGAGTCCTG TTTGGTTTAA GCAACCTCTG TTTCTCATAA ACTCCATAAA AACAGGAATA 

121 CTCTTTGTTT CTAGCATAAC CAAAAGATTT AGTGAATTGA AAACT^TGTT CCCTTAGAGT 

181 ATAGGTCTAA TAACCCCGAA AATATTACCA TGATACTGAG CATTTGTAAG TATCTCATAG 

241 CATGTAGTAT CCATAGTCCA TCAATGAGAG AGACATTTAA CATGATTTTC ATTAATCAGG 
Figure 21B: 

301 TGGAAAAGAC ATGACAACAT TCACAGGCAC TGCACAGAAC ATAGTGGTCC ACCTTGCACA 
361 TATTTCACTA AACTAGGTTT ATCTATTTTG TTGCTTTCTC TAACATCTCT GCAATGAAGC 
421 AGGTCAACAG TGCCACATAT CCTTTACTTA ACCTAAGGAA CACAAAAAAT TTTCTACATA 
481 TATCCTGGTT AGAGAGTGCT TAAAATAAGT TTTCCAAGAA TGGAAAAGAA ATGTTCTGAC 
541 TTAACAATTA AGACAGTATT TATTTAAAGC AAGAAATATG AGGCACACAA GAAAATATTT 
601 TGGGAAGAAA CCATTTGGTG AACAATATTT CAAATAAAAA TAGACAAACA TAGTTAATTG 
661 TAAAACATAT GTTTGACCAG CCCTTCTTTT CAATAGGCTT AATGTGAATA AAATGTTAAA 
721 GATTCTCTTT GGGTGGCTGC AAATTGTCCA CGAATAAGAC AAAATATAAA AATAAGGACT 
781 GAGTCTCACA AAATGAAAAG GAAATATATT CAGAAAGAGA ATCTTGAGAG AATGTGTTGT 
841 CACAAATTAA AGAAAACCTG TGGTGAATGA CATCCTGAGG CCTGAGCTAT TACTGACATT 
901 TAAGATAAAG GTAACTGTAT ACATTTGTCC CATTGAGGGG ACAAGAAAGC TGCTCTCATG 
961 TTCAGCTCTA TAATTCTTGC CTTAAACAAC TTAAATAGAA TGATTTAAAA TATGGAGCTG 
1021 TCCATGGACC TTTGAAATAT AAAATAGTCA AGCAACTTAT CAAGGAATTA CAGATTCCTT 
1081 GATACTAACA CAGGTAAATC CCACACGTGT TTTGAGACTA CATTTGCTGG GATTTTATTG 
1141 ATGTAATAGG TCACATGTTT TTCGGGCCAA TGTTGCTGTT ATTCGGTTAC TTCAAGAGAA 
1201 TAGTGGCAAC TGATGCTATG TATTCTAGGG GTTTGAAGTG ATGTTTCATG ATTGAAATTT 
1261 GTAAAAGAAT AACATCATCA TTCTTAACAA TAGAACATAT AAAGTCACAC AGAAGTGACA 
1321 GTGTTTAAGC TGTACTATTG ATCAAAGAAA TTTATTACCT TCAGTTTCAA TGGAAATAAT 
1381 TACTGATAAT ACAAACATGT GTGAACACAC ACTAATCCTA TCCAAATGCA CAGTGATACA 
1441 CAGAAAATAT TAGCAAGTAG AATGCAATAT TTATATAACG ATTGTATTTA TCAATCAATT 
1501 GTATGTATCA ATATATGGGC TATTTTCTTA CACATGATTT TATTCAAATT TACTCTAATC 
1561 ATTGTTGAAC CATTTAGAAA AGGCATACTG GCAACTTTTC CTTACCTCAT CCAGCTGGGC 
1621 AAAAGTCCCA GTGTGGAGTA AAGGATGCAA GATTTCCTGC TCTGTTAAGT ATAAAATAAT 
1681 AGTATGAATT CAAAGGTGCC ATTCTTCTGC TTCTAGTTAT AAAGGCAGTG CTTGCTTCTT 
1741 CCAGCACAGA TCTGGATCTC GAGGAGCTTG GCGAGATTTT CAGGAGCTAA GGAAGCTAAA 
1801 AGCCGCCACC ATGAAAGCCA TCTTAATCCC ATTTTTATCT CTTCTGATTC CGTTAACCCC 
1861 GCAATCTGCA TTCGCTCAGA GTGAGCCGGA GCTGAAGCTG GAAAGTGTGG TGATTGTCAG 
1921 TCGTCATGGT GTGCGTGCTC CAACCAAGGC CACGCAACTG ATGCAGGATG TCACCCCAGA 
1981 CGCATGGCCA ACCTGGCCGG TAAAACTGGG TTGGCTGACA CCGCGCGGTG GTGAGCTAAT 
2041 CGCCTATCTC GGACATTACC AACGCCAGCG TCTGGTAGCC GACGGATTGC TGGCGAAAAA 
2101 GGGCTGCCCG CAGTCTGGTC AGGTCGCGAT TATTGCTGAT GTCGACGAGC GTACCCGTAA 
2161 AACAGGCGAA GCCTTCGCCG CCGGGCTGGC ACCTGACTGT GCAATAACCG TACATACCCA 
2221 GGCAGATACG TCCAGTCCCG ATCCGTTATT TAATCCTCTA AAAACTGGCG TTTGCCAACT 
2281 GGATAACGCG AACGTGACTG ACGCGATCCT CAGCAGGGCA GGAGGGTCAA TTGCTGACTT 
2341 TACCGGGCAT CGGCAAACGG CGTTTCGCGA ACTGGAACGG GTGCTTAATT TTCCGCAATC 
2401 AAACTTGTGC CTTAAACGTG AGAAACAGGA CGAAAGCTGT TCATTAACGC AGGCATTACC 
2461 ATCGGAACTC AAGGTGAGCG CCGACAATGT CTCATTAACC GGTGCGGTAA GCCTCGCATC 
2521 AATGCTGACG GAGATATTTC TCCTGCAACA AGCACAGGGA ATGCCGGAGC CGGGGTGGGG 
2581 AAGGATCACC GATTCACACC AGTGGAACAC CTTGCTAAGT TTGCATAACG CGCAATTTTA 
2641 TTTGCTACAA CGCACGCCAG AGGTTGCCCG CAGCCGCGCC ACCCCGTTAT TAGATTTGAT 
2701 CAAGACAGCG TTGACGCCCC ATCCACCGCA AAAACAGGCG TATGGTGTGA CATTACCCAC 
2761 TTCAGTGCTG TTTATCGCCG GACACGATAC TAATCTGGCA AATCTCGGCG GCGCACTGGA 
2821 GCTCAACTGG ACGCTTCCCG GTGAGCCGGA TAACACGCCG CCAGGTGGTG AACTGGTGTT 
2881 TGAACGCTGG CGTCGGCTAA GCGATAACAG CCAGTGGATT CAGGTTTCGC TGGTCTTCCA 
2941 GACTTTACAG CAGATGCGTG ATAAAACGCC GCTGTCATTA AATACGCCGC CCGGAGAGGT 
3001 GAAACTGACC CTGGCAGGAT GTGAAGAGCG AAATGCGCAG GGCATGTGTT CGTTGGCAGG 
3061 TTTTACGCAA ATCGTGAATG AAGCACGCAT ACCCGCTTGC AGTTTGTAAG GTATAAGGCA 
3121 GTTATTGGTG CCCTTAAACG CCTGGTGCTA CGCCTGAATA AGTGATAATA AGCGGATGAA 
3181 TGGCAGAAAT TCGCCGGATC TTTGTGAAGG AACCTTACTT CTGTGGTGTG ACATAATTGG 
3241 ACAAACTACC TACAGAGATT TAAAAAACCT CCCACACCTC CCCCTGAACC TGAAACATAA 
3301 AATGAATGCA ATTGTTGTTG TTAACTTGTT TATTGCAGCT TATAATGGTT ACAAATAAAG 
3361 CAATAGCATC ACAAATTTCA CAAATAAAGC ATTTTTTTCA CTGCATTCTA GTTGTGGTTT 
3421 GTCCAAACTC ATCAATGTAT CTTATCATGT CTGGATCGAT CCCCGGGTAC 

// 
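
As a cross-check on the mutation entries of the R15/appa record, the sketch below recomputes where codons 3, 428 and 429 of the appA CDS fall and tests whether each listed codon change is synonymous. It assumes the CDS start (1811) given in the feature table, reads the third replacement codon as gct, and uses a hand-written lookup limited to the five codons involved; the names `codon_span`, `is_silent`, `CODON_TO_AA` and `MUTATIONS` are illustrative and do not come from the figure.

```python
# Amino acids for the five codons named in the mutation features.
# These assignments are the same in the standard code and in translation table 11.
CODON_TO_AA = {
    "GCG": "Ala", "GCC": "Ala", "GCT": "Ala",
    "CCG": "Pro", "CCC": "Pro",
}

# (label, codon number within the CDS, original codon, replacement codon)
MUTATIONS = [
    ("A3", 3, "gcg", "gcc"),
    ("P428", 428, "ccg", "ccc"),
    ("A429", 429, "gcg", "gct"),  # assumption: "gct" is the intended reading
]

CDS_START = 1811  # first base of the appA CDS in the R15/appa record


def codon_span(cds_start, codon_number):
    """Return (first, last) base positions of a 1-based codon number."""
    first = cds_start + 3 * (codon_number - 1)
    return first, first + 2


def is_silent(original, replacement):
    """True when both codons encode the same amino acid."""
    return CODON_TO_AA[original.upper()] == CODON_TO_AA[replacement.upper()]


for label, number, original, replacement in MUTATIONS:
    span = codon_span(CDS_START, number)
    print(label, span, is_silent(original, replacement))
```

Under these assumptions the computed spans can be compared directly with the `replace(...)` coordinates in the feature table, and a `True` in the last column corresponds to the "silent mutation" phenotype annotation.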
Figure 22: Nucleic acid sequence of the SV40/APPA+intron plasmid (SEQ ID NO:6) 



LOCUS SV40/APPA 5421 bp DNA CIRCULAR SYN 14-APR-2000 
DEFINITION Ligation of SV40 promoter/enhancer into CAT/APPA+intron 
ACCESSION SV40/APPA 
REFERENCE 1 (bases 1 to 5421) 
SOURCE synthetic construct. 
ORGANISM synthetic construct 
artificial sequence. 
KEYWORDS SV40 promoter/enhancer; acid glucose-1-phosphatase; appA gene; periplasmic phosphoanhydride phosphohydrolase; artificial sequence; 
AUTHORS Golovan, S., Forsberg, C.W., Phillips, J. 
JOURNAL Unpublished. 

DEFINITION E. coli periplasmic phosphoanhydride phosphohydrolase (appA) gene, 
ACCESSION M58708 L03370 L03371 L03372 L03373 L03374 L03375 
VERSION M58708.1 GI:145283 
SOURCE Escherichia coli DNA. 
ORGANISM Escherichia coli 
Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae; Escherichia. 
REFERENCE 1 (bases 40..1337) 
AUTHORS Dassa,J., Marck,C. and Boquet,P.L. 
TITLE The complete nucleotide sequence of the Escherichia coli gene appA reveals significant homology between pH 2.5 acid phosphatase and glucose-1-phosphatase 
JOURNAL J. Bacteriol. 172 (9), 5497-5500 (1990) 
MEDLINE 90368616 
FEATURES Location/Qualifiers 
Source 40..1337 
/organism="Escherichia coli" 
/db_xref="taxon:562" 
sig_peptide 40..105 
/gene="appA" 
CDS 40..1337 
/gene="appA" 
/standard_name="acid phosphatase/phytase" 
/transl_table=11 
/product="periplasmic phosphoanhydride phosphohydrolase" 
/protein_id="AAA72086.1" 
/db_xref="GI:145285" 
mat_peptide 106..1334 
/gene="appA" 
Figure 22A: 

/product="periplasmic phosphoanhydride phosphohydrolase" 
mutation replace(46..48,"gcg changed to gcc") 
/gene="appA" 
/standard_name="A3 mutant" 
/note="created by site directed mutagenesis" 
/phenotype="silent mutation" 
mutation replace(1320..1322,"ccg changed to ccc") 
/gene="appA" 
/standard_name="P428 mutant" 
/note="created by site directed mutagenesis" 
/phenotype="silent mutation" 
mutation replace(1323..1325,"gcg changed to gct") 
/gene="appA" 
/standard_name="A429 mutant" 
/note="created by site directed mutagenesis" 
/phenotype="silent mutation" 



DEFINITION Plasmid pBLCAT3 (bases 2200 to 4924) 
ACCESSION X64409 
VERSION X64409.1 GI:58163 
SOURCE synthetic construct. 
ORGANISM synthetic construct 
artificial sequence. 
REFERENCE 1 (bases 2200 to 4924) 
AUTHORS Luckow,B.H.R. 
TITLE Direct Submission 
JOURNAL Submitted (06-FEB-1992) B.H.R. Luckow, German Cancer Res Center, Im Neuenheimer Feld 280, W-6900 Heidelberg, FRG 
REFERENCE 2 (bases 2200 to 4924) 
AUTHORS Luckow,B. and Schutz,G. 
TITLE CAT constructions with multiple unique restriction sites for the functional analysis of eukaryotic promoters and regulatory elements 
JOURNAL Nucleic Acids Res. 15 (13), 5490 (1987) 
MEDLINE 87260024 
COMMENT Promoterless CAT vector for transient transfection experiments with eukaryotic cells. Allows the analysis of foreign promoters and enhancers. 
FEATURES Location/Qualifiers 
source 2200 to 4924 
/organism="synthetic construct" 
/db_xref="taxon:32630" 
SV40 t intron 1380..1993 
/note="SV40 signals" 
polyA_signal 1990..2230 
/note="SV40 signals" 
CDS complement(3471..4317) 
/codon_start=1 
/transl_table=11 
/gene="Amp" 
/product="beta-lactamase" 
/protein_id="CAA45753.1" 
/db_xref="GI:58165" 
Figure 22B: 

SV40 promoter/enhancer 5023..5402 
/note="SV40 signals" 



BASE COUNT 1413 a 1321 c 1331 g 1355 t 
ORIGIN 

1 CGAGATTTTC AGGAGCTAAG GAAGCTAAAA GCCGCCACCA TGAAAGCCAT CTTAATCCCA 
61 TTTTTATCTC TTCTGATTCC GTTAACCCCG CAATCTGCAT TCGCTCAGAG TGAGCCGGAG 
121 CTGAAGCTGG AAAGTGTGGT GATTGTCAGT CGTCATGGTG TGCGTGCTCC AACCAAGGCC 
181 ACGCAACTGA TGCAGGATGT CACCCCAGAC GCATGGCCAA CCTGGCCGGT AAAACTGGGT 
241 TGGCTGACAC CGCGNGGTGG TGAGCTAATC GCCTATCTCG GACATTACCA ACGCCAGCGT 
301 CTGGTAGCCG ACGGATTGCT GGCGAAAAAG GGCTGCCCGC AGTCTGGTCA GGTCGCGATT 
361 ATTGCTGATG TCGACGAGCG TACCCGTAAA ACAGGCGAAG CCTTCGCCGC CGGGCTGGCA 
421 CCTGACTGTG CAATAACCGT ACATACCCAG GCAGATACGT CCAGTCCCGA TCCGTTATTT 
481 AATCCTCTAA AAACTGGCGT TTGCCAACTG GATAACGCGA ACGTGACTGA CGCGATCCTC 
541 AGCAGGGCAG GAGGGTCAAT TGCTGACTTT ACCGGGCATC GGCAAACGGC GTTTCGCGAA 
601 CTGGAACGGG TGCTTAATTT TCCGCAATCA AACTTGTGCC TTAAACGTGA GAAACAGGAC 
661 GAAAGCTGTT CATTAACGCA GGCATTACCA TCGGAACTCA AGGTGAGCGC CGACAATGTC 
721 TCATTAACCG GTGCGGTAAG CCTCGCATCA ATGCTGACGG AGATATTTCT CCTGCAACAA 
781 GCACAGGGAA TGCCGGAGCC GGGGTGGGGA AGGATCACCG ATTCACACCA GTGGAACACC 
841 TTGCTAAGTT TGCATAACGC GCAATTTTAT TTGCTACAAC GCACGCCAGA GGTTGCCCGC 
901 AGCCGCGCCA CCCCGTTATT AGATTTGATC AAGACAGCGT TGACGCCCCA CCACCGCAAA 
961 AACAGGCGTA TGGTGTGACA TTACCCACTT CAGTGCTGTT TATCGCCGGA CACGATACTA 
1021 ATCTGGCAAA TCTCGGCGGC GCACTGGAGC TCAACTGGAC GCTTCCCGGT CAGCCGGATA 
1081 ACACGCCGCC AGGTGGTGAA CTGGTGTTTG AACGCTGGCG TCGGCTAAGC GATAACAGCC 
1141 AGTGGATTCA GGTTTCGCTG GTCTTCCAGA CTTTACAGCA GATGCGTGAT AAAACGCCGC 
1201 TGTCATTAAA TACGCCGCCC GGAGAGGTGA AACTGACCCT GGCAGGATGT GAAGAGCGAA 
1261 ATGCGCAGGG CATGTGTTCG TTGGCAGGTT TTACGCAAAT CGTGAATGAA GCACGCATAC 
1321 CCGCTTGCAG TTTGTAAGGC AGTTATTGGT GCCCTTAAAC GCCTGGTGCT ACGCCTGAAT 
1381 AAGTGATAAT AAGCGGATGA ATGGCAGAAA TTCGCCGGAT CTTTGTGAAG GAACCTTACT 
1441 TCTGTGGTGT GACATAATTG GACAAACTAC CTACAGAGAT TTAAAGCTCT AAGGTAAATA 
1501 TAAAATTTTT AAGTGTATAA TGTGTTAAAC TACTGATTCT AATTGTTTGT GTATTTTAGA 
1561 TTCCAACCTA TGGAACTGAT GAATGGGAGC AGTGGTGGAA TGCCTTTAAT GAGGAAAACC 
1621 TGTTTTGCTC AGAAGAAATG CCATCTAGTG ATGATGAGGC TACTGCTGAC TCTCAACATT 
1681 CTACTCCTCC AAAAAAGAAG AGAAAGGTAG AAGACCCCAA GGACTTTCCT TCAGAATTGC 
1741 TAAGTTTTTT GAGTCATGCT GTGTTTAGTA ATAGAACTCT TGCTTGCTTT GCTATTTACA 
1801 CCACAAAGGA AAAAGCTGCA CTGCTATACA AGAAAATTAT GGAAAAATAT TCTGTAACCT 
1861 TTATAAGTAG GCATAACAGT TATAATCATA ACATACTGTT TTTTCTTACT CCACACAGGC 
1921 ATAGAGTGTC TGCTATTAAT AACTATGCTC AAAAATTGTG TACCTTTAGC TTTTTAATTT 
1981 GTAAAGGGGT TAATAAGGAA TATTTGATGT ATAGTGCCTT GACTAGAGAT CATAATCAGG 
2041 CATACCACAT TTGTAGAGGT TTTACTTGCT TTAAAAAACC TCCCACACCT CCCCCTGAAC 
2101 CTGAAACATA AAATGAATGC AATTGTTGTT GTTAACTTGT TTATTGCAGC TTATAATGGT 
2161 TACAAATAAA GCAATAGCAT CACAAATTTC ACAAATAAAG CATTTTTTTC ACTGCATTCT 
2221 AGTTGTGGTT TGTCCAAACT CATCAATGTA TCTTATCATG TCTGGATCGA TCCCCGGGTA 
2281 CCGAGCTCGA ATTCGTAATC ATGGTCATAG CTGTTTCCTG TGTGAAATTG TTATCCGCTC 
2341 ACAATTCCAC ACAACATACG AGCCGGAAGC ATAAAGTGTA AAGCCTGGGG TGCCTAATGA 
2401 GTGAGCTAAC TCACATTAAT TGCGTTGCGC TCACTGCCCG CTTTCCAGTC GGGAAACCTG 
2461 TCGTGCCAGC TGCATTAATG AATCGGCCAA CGCGCGGGGA GAGGCGGTTT GCGTATTGGG 
2521 CGCTCTTCCG CTTCCTCGCT CACTGACTCG CTGCGCTCGG TCGTTCGGCT GCGGCGAGCG 
2581 GTATCAGCTC ACTCAAAGGC GGTAATACGG TTATCCACAG AATCAGGGGA TAACGCAGGA 
2641 AAGAACATGT GAGCAAAAGG CCAGCAAAAG GCCAGGAACC GTAAAAAGGC CGCGTTGCTG 
2701 GCGTTTTTCC ATAGGCTCCG CCCCCCTGAC GAGCATCACA AAAATCGACG CTCAAGTCAG 
2761 AGGTGGCGAA ACCCGACAGG ACTATAAAGA TACCAGGCGT TTCCCCCTGG AAGCTCCCTC 
2821 GTGCGCTCTC CTGTTCCGAC CCTGCCGCTT ACCGGATACC TGTCCGCCTT TCTCCCTTCG 
2881 GGAAGCGTGG CGCTTTCTCA ATGCTCACGC TGTAGGTATC TCAGTTCGGT GTAGGTCGTT 
2941 CGCTCCAAGC TGGGCTGTGT GCACGAACCC CCCGTTCAGC CCGACCGCTG CGCCTTATCC 
3001 GGTAACTATC GTCTTGAGTC CAACCCGGTA AGACACGACT TATCGCCACT GGCAGCAGCC 
3061 ACTGGTAACA GGATTAGCAG AGCGAGGTAT GTAGGCGGTG CTACAGAGTT CTTGAAGTGG 

Figure 22C: 

3121 TGGCCTAACT ACGGCTACAC TAGAAGGACA GTATTTGGTA TCTGCGCTCT GCTGAAGCCA 
3181 GTTACCTTCG GAAAAAGAGT TGGTAGCTCT TGATCCGGCA AACAAACCAC CGCTGGTAGC 
3241 GGTGGTTTTT TTGTTTGCAA GCAGCAGATT ACGCGCAGAA AAAAAGGATC TCAAGAAGAT 
3301 CCTTTGATCT TTTCTACGGG GTCTGACGCT CAGTCCAACG AAAACTCACG TTAAGGGATT 
3361 TTGGTCATGA GATTATCAAA AAGGATCTTC ACCTAGATCC TTTTAAATTA AAAATGAAGT 
3421 TTTAAATCAA TCTAAAGTAT ATATGAGTAA ACTTGGTCTG ACAGTTACCA ATGCTTAATC 
3481 AGTGAGGCAC CTATCTCAGC GATCTGTCTA TTTCGTTCAT CCATAGTTGC CTGACTCCCC 
3541 GTCGTGTAGA TAACTACGAT ACGGGAGGGC TTACCATCTG GCCCCAGTGC TGCAATGATA 
3601 CCGCGAGACC CACGCTCACC GGCTCCAGAT TTATCAGCAA TAAACCAGCC AGCCGGAAGG 
3661 GCCGAGCGCA GAAGTGGTCC TGCAACTTTA TCCGCCTCCA TCCAGTCTAT TAATTGTTGC 
3721 CGGGAAGCTA GAGTAAGTAG TTCGCCAGTT AATAGTTTGC GCAACGTTGT TGCCATTGCT 
3781 ACAGGCATCG TGGTGTCACG CTCGTCGTTT GGTATGGCTT CATTCAGCTC CGGTTCCCAA 
3841 CGATCAAGGC GAGTTACATG ATCCCCCATG TTGTGCAAAA AAGCGGTTAG CTCCTTCGGT 
3901 CCTCCGATCG TTGTCAGAAG TAAGTTGGCC GCAGTGTTAT CACTCATGGT TATGGCAGCA 
3961 CTGCATAATT CTCTTACTGT CATGCCATCC GTAAGATGCT TTTCTGTGAC TGGTGAGTAC 
4021 TCAACCAAGT CATTCTGAGA ATAGTGTATG CGGCGACCGA GTTGCTCTTG CCCGGCGTCA 
4081 ATACGGGATA ATACCGCGCC ACATAGCAGA ACTTTAAAAG TGCTCATCAT TGGAAAACGT 
4141 TCTTCGGGGC GAAAACTCTC AAGGATCTTA CCGCTGTTGA GATCCAGTTC GATGTAACCC 
4201 ACTCGTGCAC CCAACTGATC TTCAGCATCT TTTACTTTCA CCAGCGTTTC TGGGTGAGCA 
4261 AAAACAGGAA GGCAAAATGC CGCAAAAAAG GGAATAAGGG CGACACGGAA ATGTTGAATA 
4321 CTCATACTCT TCCTTTTTCA ATATTATTGA AGCATTTATC AGGGTTATTG TCTCATGAGC 
4381 GGATACATAT TTGAATGTAT TTAGAAAAAT AAACAAATAG GGGTTCCGCG CACATTTCCC 
4441 CGAAAAGTGC CACCTGACGT CTAAGAAACC ATTATTATCA TGACATTAAC CTATAAAAAT 
4501 AGGCGTATCA CGAGGCCCTT TCGTCTCGCG CGTTTCGGTG ATGACGGTGA AAACCTCTGA 
4561 CACATGCAGC TCCCGGAGAC GGTCACAGCT TGTCTGTAAG CGGATGCCGG GAGCAGACAA 
4621 GCCCGTCAGG GCGCGTCAGC GGGTGTTGGC GGGTGTCGGG GCTGGCTTAA CTATGCGGCA 
4681 TCAGAGCAGA TTGTACTGAG AGTGCACCAT ATGCGGTGTG AAATACCGCA CAGATGCGTA 
4741 AGGAGAAAAT ACCGCATCAG GCGCCATTCG CCATTCAGGC TGCGCAACTG TTGGGAAGGG 
4801 CGATCGGTGC GGGCCTCTTC GCTATTACGC CAGCTGGCGA AAGGGGGATG TGCTGCAAGG 
4861 CGATTAAGTT GGGTAACGCC AGGGTTTTCC CAGTCACGAC GTTGTAAAAC GACGGCCAGT 
4921 GCCAAGCTTT ACACTTTATG CTTCCGGCTC GTATGTTGTG TGGAATTGTG AGCGGATAAC 
4981 AATTTCACAC AGGAAACAGC TATGACCATG ATTACGAATT CGGCGCAGCA CCATGGCCTG 
5041 AAATAACCTC TGAAAGAGGA ACTTGGTTAG GTACCTTCTG AGGCGGAAAG AACCAGCTGT 
5101 GGAATGTGTG TCAGTTAGGG TGTGGAAAGT CCCCAGGCTC CCCAGCAGGC AGAAGTATGC 
5161 AAAGCATGCA TCTCAATTAG TCAGCAACCA GGTGTGGAAA GTCCCCAGGC TCCCCAGCAG 
5221 GCAGAAGTAT GCAAAGCATG CATCTCAATT AGTCAGCAAC CATAGTCCCG CCCCTAACTC 
5281 CGCCCATCCC GCCCCTAACT CCGCCCAGTT CCGCCCATTC TCCGCCCCAT GGCTGACTAA 
5341 TTTTTTTTAT TTATGCAGAG GCCGAGGCCG CCTCGGCCTC TGAGCTATTC CAGAAGTAGT 
5401 GEGGAGGCTC GAGGAGCTTG G 
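
The BASE COUNT lines can be checked against the lengths declared on the LOCUS lines with simple arithmetic. The sketch below does this for the R15/appa and SV40/APPA records, using only numbers printed in those two records; the helper name `base_count_gap` is hypothetical and the interpretation of any gap is an assumption.

```python
def base_count_gap(declared_length, counts):
    """Return (sum of the a/c/g/t counts, declared length minus that sum)."""
    total = sum(counts.values())
    return total, declared_length - total


# Values copied from the LOCUS and BASE COUNT lines of the two records.
r15_total, r15_gap = base_count_gap(
    3470, {"a": 1065, "c": 721, "g": 735, "t": 949}
)
sv40_total, sv40_gap = base_count_gap(
    5421, {"a": 1413, "c": 1321, "g": 1331, "t": 1355}
)

print("R15/appa:", r15_total, "gap", r15_gap)
print("SV40/APPA:", sv40_total, "gap", sv40_gap)
```

A gap of zero means the four counts account for every base. A gap of one in the SV40/APPA record would be consistent with the single ambiguous base (N) that appears in its ORIGIN block, since BASE COUNT lists only a, c, g and t.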
Figure 23. The nucleic acid sequence of the Lama2/APPA transgene used for the generation 
of transgenic mice and transgenic pigs (SEQ ID NO: 7) 

LOCUS transgene 17732 bp DNA SYN 14-APR-2000 
DEFINITION Lama-appA cut XhoI..20623 to NotI..17732 
ACCESSION transgene 
KEYWORDS parotid secretory protein; acid glucose-1-phosphatase; appA gene; periplasmic phosphoanhydride phosphohydrolase; artificial sequence; cloning vector 
REFERENCE 1 (bases 1 to 17732) 
AUTHORS Golovan, S., Forsberg, C.W., Phillips, J. 
JOURNAL Unpublished. 
FEATURES 

DEFINITION M. musculus Psp gene for parotid secretory protein. 
ACCESSION X68699 
VERSION X68699.1 GI:53809 
SOURCE house mouse. 
ORGANISM Mus musculus 
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 
REFERENCE 1 (bases 3777 to 5332;) 
AUTHORS Svendsen,P., Laursen,J., Krogh-Pedersen,H. and Hjorth,J.P. 
TITLE Novel salivary gland specific binding elements located in the PSP proximal enhancer core 
JOURNAL Nucleic Acids Res. 26 (11), 2761-2770 (1998) 
MEDLINE 98256451 
REFERENCE 2 (bases 7147 to 12653; 13952 to 17731) 
AUTHORS Mikkelsen,T.R. 
TITLE Direct Submission 
JOURNAL Submitted (07-OCT-1992) T.R. Mikkelsen, Department of Molecular Biology, University of Aarhus, CF Mollers Alle 130, 8000 Aarhus, DENMARK 
REFERENCE 3 (bases 7147 to 12653; 13952 to 17731) 
AUTHORS Laursen J, Hjorth JP 
TITLE A cassette for high-level expression in the mouse salivary 
JOURNAL Gene 1997 Oct 1;198(1-2):367-72 
MEDLINE 9370303 
FEATURES Location/Qualifiers 
source 1 to 12653; 13952 to 17731 
/organism="Mus musculus" 
/strain="C3H/AS" 
/db_xref="taxon:10090" 
/chromosome="2" 
/map="Estimate: 69 cM from centromere" 
/clone="Lambda YP1, Lambda YP3, Lambda YP7" 
/clone_lib="Lambda-PHAGE (Lambda L47.1)" 
/germline 
/note="Allele: b" 
misc_feature 3777-5332 
/gene="PSP" 
/function="salivary gland specific positive acting regulatory region" 
enhancer 7147..8724 
Figure 23A: 

/evidence=experimental 
exon 11778..11824 
/gene="Psp" 
/note="exon a" 
/number=1 
/evidence=experimental 
exon 12626..14190 
/gene="Psp" 
/note="exon b fused with exons h and i" 
misc_feature 12644-12652 
/function="consensus sequence for initiation in higher eukaryotes" 
misc_feature 13952-13965 
/function="M13mp18 polylinker" 

DEFINITION E. coli periplasmic phosphoanhydride phosphohydrolase (appA) gene, 
ACCESSION M58708 L03370 L03371 L03372 L03373 L03374 L03375 
VERSION M58708.1 GI:145283 
SOURCE Escherichia coli DNA. 
ORGANISM Escherichia coli 
Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae; Escherichia. 
REFERENCE (bases 12653..13951) 
AUTHORS Dassa,J., Marck,C. and Boquet,P.L. 
TITLE The complete nucleotide sequence of the Escherichia coli gene appA reveals significant homology between pH 2.5 acid phosphatase and glucose-1-phosphatase 
JOURNAL J. Bacteriol. 172 (9), 5497-5500 (1990) 
MEDLINE 90368616 
FEATURES Location/Qualifiers 
Source 12653..13951 
/organism="Escherichia coli" 
/db_xref="taxon:562" 
sig_peptide 12653..12718 
/gene="appA" 
CDS 12653..13951 
/gene="appA" 
/standard_name="acid phosphatase/phytase" 
/transl_table=11 
/product="periplasmic phosphoanhydride phosphohydrolase" 
/protein_id="AAA72086.1" 
/db_xref="GI:145285" 
Figure 23B: 

mat_peptide 12719..13948 
/gene="appA" 
/product="periplasmic phosphoanhydride phosphohydrolase" 
mutation replace(12659..12661,"gcg changed to gcc") 
/gene="appA" 
/standard_name="A3 mutant" 
/note="created by site directed mutagenesis" 
/citation=[3] 
/phenotype="silent mutation" 
mutation replace(13934..13936,"ccg changed to ccc") 
/gene="appA" 
/standard_name="P428 mutant" 
/note="created by site directed mutagenesis" 
/citation=[3] 
/phenotype="silent mutation" 
mutation replace(13937..13939,"gcg changed to gct") 
/gene="appA" 
/standard_name="A429 mutant" 
/note="created by site directed mutagenesis" 
/citation=[3] 
/phenotype="silent mutation" 

BASE COUNT 4719 a 4125 c 4168 g 4719 t 
ORIGIN 

1 TCGAGAGTAT CTTTGTCAGC TGTGCCTCCA ACAAAGGGGT ACTGTTGCCC ACATAGAAAG 
61 ATCTAAACTA ATTAATTAAT CCCTCACCCG CAAATCTTTC AGTCACTAAG TTAGCACGAT 
121 TGTTGAACAA GTTCTCCAAA GGAGAGATAC AGATGAGTGC GTATAGGGTG GACCTGGCTG 
181 CTGAGGAGAC ACCTGCATCT GACTAAGAAG AGCCACGGTG TTAGTTGAAT GGTGTGGAGT 
241 AGGGTGGTTC TGTGGGACAG TAGAAAATCG AGAGGCATGT GCCGTTTAGT GAACTGATGG 
301 AAGCTACCCC AAACGACAGA GATTGTCAGT CAGGCCAATC CGTTTCGAGT TTGATGGGCA 
361 GCCGGACAGT GAGACAGACA CACCTACTCA GTTGGAGGAA GGATGAGAAC AATGGCCAGC 
421 AGGGATTGAG AGACCCTGAC AGGCGCAAGG CCCTAACACA CACACCTACC ACCTCACTTG 
481 ACAAAGCTGC CAAAGACCAA AGACTTGTTC TCCATTAGAA ATGACAGCTG GCTTGACCCG 
541 ACAGCATAAT AAGCAGAGTG TACTCTGATT GGAGAACTTT AATGTGTTTC ATTCAGTATT 
601 ATAAAAGGAC AGTATTACAG ATTTTGTTGT ACACTGCTGT TACATGTGGG GCAGTGTGTC 
661 TTTAAGTAGG GTAAAGTACT CTTTAAAAAT GGGTCCTAGA TATTTTTTCC TTTAACTCAA 
721 GTCTCTTACT GTTTAAATGA TTTTTATTTT GTTTAATATG GAGGAAAAAG AAGCGTAAAT 
781 GGACAATATA TATTTAGAGA AAGATGGTTA GCTGTCAOAA AAATATGCAA OT^CAAAATCA 
841 CACCAAGACT GCAGCACACC CCTGTCAGAT GGCTGTGATC AAGAAAATAA ATGACAATGA 
901 GTGGTGGTGA AGATGTACTA AAGGGAAACA CACACACACA CACACACACA CACACACACA 
961 CACACTGGAG CAACCACTGT GGAAATCAGT ATGAATGGTC CTCAAAAACC TGAAGATAGA 
1021 GCGGGGCGTG GTGGCATACA CTTTTATTCC CAGCACTGGG GAGGCAGAGG CAGGTGGATC 
1081 TCTGAGTTCC AGGCCAGCCT GGTCTATAGC ACAGGTTCTA GGACAGCCAG GGCTACACAG 
1141 AAAAACCCTG CCTTGATTAA ACCAAACCAA ACCAAACCAA ACCAAACCAA ACCAAACCAA 
1201 ACCAAACCAA ACCAAACCAG ACCAAACCAA AACACTGAAG ATAGAACTTC AGTATTCCAT 
1261 TCCTAGATAT ATACCCAATG GAGACTAAGT CAGCAAGACA CCTGCACAGC CATGTTCACT 
1321 ACTACACTGT TCACCACAGC CAGGCTGTGG AAOCAGCCTG AGT^TCCATG t.TAAATGAAT 

Figure 23C: 



13 81 GGATAGGTAA CTTTCAAGGT AAATGGACTC TGCTGTGTAC ATGCCTCACA TTCTGTTTAT 
1441 TCATTTTTCT TTATGAGGTG TCCATTCAGG AGTCACATGG TAGTTCTATT TT<:AGTCTTC 
1501 TGAAGATACT ACACTGGTCC CCACAGTTTA CACTTTTATC AGCAGTGAAT AAGGGTTCCT 
1561 CTATCCTTAC CATCATTTGT TGTAATTTTT CTTGATGACC CTCTTTCTGA CAGGGATAX3G 
1621 ATGTAATATC AGTGTGAGGA AGTACAACTT GTTTTCTAAG TATTTATTGG CCCCTTGCAT 
1681 TTCTTCTTTT GAAAACTGTC GGTTCCTGAC ATCTGCTCAG GTATTCATTG GATGTTGTTT 
1741 CTTTGGTGTT TGAGTTCTTA TGAATTCTAG ATGTTJ^AATC CCTGCCTGTG GTTCTCTCCC 
1801 ATTCTGTAGG CTGCCTCCTC ACCCTGGCAA TTGTTGTCCT TGTTTTGCAG AAACTTTTGA 
1861 CTTCATGGT^ TCTCATTTGT CAGTTTTCCC TCCTCTGCTA TAGCCTGAGC TAATGCACTG 
1921 GTTTTTACAG AGCCCTGGTC TATGCCTTTA TCCTCCTCTG GCAGCTTCGG AGTTTCATTT 
1981 CTTACATTTA GATCTTTGAT CCACTTTGAA CAAGTTTTGG AGCAGGGTGA GAGATACGAA 
2041 TCTAGTTCCA TTCTTCCATA TGTGATCCTA GTTTACATAG CATCGTTGGT TGAAGAGGTT 
2101 TTATTTTATT TTTAAATAAT GTGTCATAAA AAACGAGGTG GTTGTAGCAG TGTGGATTTG 
2161 TTTCTTTGTC CTTTGATCTA CAGGTCTTGT TTTGTGTCAG TCTCATGATG TTTTATTGCT 
2221 ATGGCTCTGT CATACAGTCT GAGGTCAGGT ATTGTGATAT ACCTTCAGTA TTGCTCGCTC 
2281 AGACTCAGGT TTGCTTTGGC CAGGAGTCAT CTTACTCAGT GCTCTTAGAG CTCCCCCAGC 
2341 ATGTAGCTGC TACTATTCTT AGTTGATAAA TCAGGAAACT GGGGCTCAGA GAGATTAACT 
2401 GTCTTGAACT ACTTCTGGGG AGGTGAA?VCG TGGAGACACT AAACTGTGTT TACCCTGTAC 
2461 TGCTCCAGTA GCTGTCGGGT GCTGGGCTAC AGCAAAGCAC CTATACTATA TATTACTCAG 
2521 GAGGTGGAAA AACTCAGCCT CCCTTGGGGT TCCCAAGCTC CCAGGTGTCC AGTCACTGCT 
2581 GGAAACCTCA TGGAGTCTGA AAGGAAGGGT TGAGGGTACA TGGGGCAGCX3 ATGAGGAGCC 
2641 TGGGGCTGGG ATCTCCCAAA CACCTGGATA TCCAGATGCC ACTGGGTCAG GGGGAGTTGG 
2701 GAACAGAGTT (3GGATGTGCA TGGACCTGTG ACAAGGCCAG GGCCAGGGGG AGGATAACTC 
2761 TGGCTTTACT AATTTGCGAA AGTCCTTAGC TTAGCAGCAG TTGTCTCGGA GCACAGAGGG 
2821 GCCTTCTGTA AGAGGCTCAG GCAGTGCCGC TCTGTAGGCG AAGGTCTTCT CCATGTTCCC 
2881 CATGGTGGTT CTTGATGAAA GAGACAGTCC TTGGCTCCAA ACTGGTTTAT TGATTGTTCA 
2941 TTGTGGAAAA TGGGTGCACA CCACCTTCTC AGGGTGGACC AGAGATCAAA TACCTTTTGC 
3001 AGGGAGGAAT ATCTGGGAAG GGACGCTTAC TGGCTAAACC CTCAGGGCCT CTAGATACAT 
3061 CATTAGCATG GAGAACTCTG TTCTGGGCTA CATGACCACA GGCCACATTT CCACAAGCCA 
3121 CATGTGGGAA GTGTGGCACA TGTTCTAGGC CAGGAATCTG GTAGGGAGCG TGGAGCCACC 
3181 TACCATCCCA GGTGGGTGCC TGGGTGCCAG GGACCCTGAA CCCGCTCAAC CTTACCAAGT 

3241 TTCCTGGCAG GGTCCACTGT CCTACACAGA AGCTGGAGGA GGTGTGAGGG TTGTGTCTTT 
3301 GTGGAATGTC CCATGCTGCT TGGGGCTCAG TTTCTCCACC TGTACCTCAT TGGTTTGGGT 
3361 ATAAAAAGTG GGGATACTTT ATTATTCTCT GACTCGGTCC TGA<3GAAAAA GCATCGTGGC 
3421 AGTCCAGGAA CCACACCGTG AGGTTCCTGC ACTGAAGGGA CTCCCTAAGT CTCTGGAGTC 
3481 TCTCCCCTTC ACAGAGCTGC CAAAGTCTAG GTrCTTTTGA GGATAACAGA GCCATGCTTG 
3541 GTAAGCAGAC AACAGCATTT GTTTACTCAA CCTTCTTTTG TCAGCTCCCT CTTCATAAAC 
3601 AAGTTGAGAC ACCATGCTGG CTTGAGGAAG ACTTCTAAAG CCAGAGAACT GTGCAAGGAA 
3661 GAAGAAGAAG GGGCAAGTGG AGTTAGCGTG GATGTAGCCC TCAAAGTCTC CAGAGACCAG 
3721 CCATGAAGGC TCAAGTGGAG GGCAAGACCT GCAGCAGCCA AGCATCTGGC AGGAGAGGAT 
3781 CCTGGGAACC CCTCTACCAT GACACACATT CTTCCTGCAG GTCACACTTA ATAGGCCATT 

3841 TCTTATTTGG ATCTATCATG GTGTTCTGTG CGAGATTAAT GAGGTGTTAT GCTGCGAACA 
3901 GAAAGTTATA TAAAAACAAG TCCCCCCCCC TTGTCACTGC TGCTAAGAAT GTAGCAGAAA 
3961 TTGTCTCAAG TGTCTCTCTA ATCAGAAACA ATAAAGGTCT CCTTGGATTC AAGCCCTCCA 
4021 GTTTCCTCCT TCCTTGCTGA GCCTTGGACA CCCATACAAA CCTCCTGGAT GCTACAGCTC 
4081 TGGGCAGAGA CTCCAAGGTG GGGAGAGACT GATGGTACAA AAGGAAAATA CTTGTTPGGG 
4141 GGTACACCCA CTCCTCTGCC TGTGTGGTTC CTGCA<5TCAG TCCTGCAGAC AGGCCCTCAG 
4201 TGGGTCTTCC ATGGGCAACA COCAGAGGGA GGCAATGGAT GGGAATAGCC ACACCCTGGT 
4261 TAGTTTACCC CGGCCATGCT CTCTGCTCTT CATCCCTCCT CTGCCCTCTG CCACGGCTTT 
4321 CTCTGCAGGA ATCATATCTT CATATTGGCC CACAGGTGTT CTCCTCACCC TAGCTATGAT 
4381 GTTTACTTTA GAGTGACCTT AGCAGGGCTG GTGGGAATGA GTTCTAGAAG GCTCAGGGAG 
4441 ATGCTAGGGA AGAAACGTCT TCTAACTACT GAGGTTACTA AGTTCCTGGT GGTTGTCTCT 
4501 GCCTTTCCCT TGTTAAAGTC ACCTTGAA<3T TAGTGCAGAA GAAATCAGAG CCCAGTCAO^ 
4561 GAGTAAATAT GGTCCTGAAG ATTTCCTTTG AGTGCCCAGA ATCCATGACA TTTCAAGAGC 
4621 CCTCTTTGTA CCTTAAGTCA TTTGGGGTTG TATCTTCTGC TTGATGTATG TGTGTGTGTT 
4681 TATCAAAGAG TGAGATGGTT ACATAAGAGG TGCTCTAAAG GACAGA<3AGG ATTTGCAATT 
4741 GTGGCATGTG ACATCCTCAG GCCTTGCTCT GGTGCCAGGA GGAACTGATG CA<3AAAA<3AG 

4801 TAA<3A<3GTCA TTTCCTGGAG GCTGTCACTA TAGAGGAGAT CTTAC-^yGTGC ATTGCCTGCT 




Figure 23D: 



4861 CCAGGCCCTG CCTGAGGATA GACATGTGCT GACTGCAACT GAAACAGAGG CTTGGGATGG 
4921 AGAGTTAGGT TCACAGAAGG GAGGGTGGGA GATGGATGCT TGCTGGGTTC TGGGTCTCAT 
4981 CACCAGCTCC TGACCACCCG GTCAGCCCAT GTGCTTATTC CATAGCTTTC TTTTGCTATG 
5041 TTTACTCAGT GTGGTGTTTG TTGGGACCCA GCAGAAGCCA GTCCCAGGCT GACAGCTGTG 
5101 GATACACAGG GCAGCATGAG GGTCCTCAGC CTGAAGCAGT CAGGCTGGCA OAAGAGAAAG 
5161 ACCAGCACAC ATTCCTTCAA CCAACTATGT CTTGAAAAAC AAACATATTA TATCACATAT 
5221 ATTGCATTTA TGAGACAGCT AAAATGTACT CGGGTAGCAT GACTCCAGGT GGGGATATCT 
5281 GCAAGTGCCA TGAGTGGCAG AGGGACAGCC AATGTGAGGC AAGAAGGAAT TCTGGCTCAA 
5341 CACAGCTTAG CTCCCTGGTG TTGGTTCAAA CTTTGAGAGT TTGAOCACAA GCACTTTATT 
5401 TTTGACATAT TTAAACAGAG CACAACTTTG GGAAAAAGTT TTCTTATGAA AATTATCACA 
5461 ATAAAGCTTA AGGCATGACT ACATTAAAAT GCCTTTGCAA AGTATATGTG CCCTCTTCCA 
5521 CAAGAATGGT TCTATTGACT GAGAAATAAT GTTCAGGATA AAGAT?CCAGG AAGAAAAGAT 
5581 CAGGGATAAG TAAAATACTA AACTCTTTTG CAAAGTACAT AGACCCTCTT TCATAACAAT 
5641 GGGTTCTATT GACTGACAAG CACTGCTCAG GAGTTGGGAA AGAGTCTAGC ATAAGCACGA
5701 TAGCCTGGAG ACTCTAGTGA GGTCTAGTCT TACAGACAGC AAAAATCAGC AGGTTACAAA
5761 CTACATTCAT TTCCAGTTTT CTGATCAGGC ACAGGTATGA ATCCCTTCTG TTGAAGAGAA
5821 AAGTCCATGT GTTTAAAATA TCTGGTTTCT CCAGTGCTAT TAGCGAGAAG ACTTGAGCCC
5881 TATACAACTC CCACCTGGAG TGACATCCTG TCTTCATGGT ATATTACATA CCTAGACACG
5941 CTCATCTCAC AGACTTAGGA CTTTGTCTTC TGATCTCCAT TTCTGATCCC ACTTCCACCT
6001 TTGCCTTGAT AGTGTCATTT TCTTCACTGC CTTGGTGACA ACCATGTTAT CCTCTGTGTA 

6061 TTTGAGTGTT ACCATTTTCA GATTTTACCT GTATGCAAGA TCACACAGTC TTTGTCTTTC
6121 TGTCTGGATG CATGCTAATC TCTACACAAC AACCCTTCCC CGTCACTCAG ATCTTCCTCC 
6181 ATTAACACAT ACATGGTGCT GAAGAGGCTA GGGAGCTTCC CTTCAGTGGG GAGCTAGCTG 
6241 GCTATTGGGC CTTTTTGACT GTCCAGGAAG GCCCCCAATT GCTGAGACAA GAACTTAGAT 

6301 TCTTCATTAT TGACTCTAAC TCATGTATCA AGCAGAAGCT AATGAATAGT TATC?lACAGG
6361 ATCAGAGGTT CCAGTGTAAG ACACTTTGAC ATGAAAGAAC GGAGGAAGGA CAGATGGATG
6421 CATAAAAGCA GGACCACTGC CCCAGGAAGG TCCTGGAAAC TGATGCAGGG CAAAGGACAG 

6481 GTTATAAACC AAATCTTAGG GAGTCAGGAA GAGCACAGAG GAGCTCAAGC AACTGACCAC
6541 TGCTTAGGGG CTACCAACCC AATCCTCCCT GTGGGAACAG CTAAGCTATC AGCCAAGGGT 
6601 AATAAACAGG CAGGACCTGT GGATGACATG GAGAGCATAG GGACCCTGGG TCCAGCCTTT 
6661 AGCACCTGCA CTCTCAGGAT ACTCCACCAT TGTGTCTTAG AGAGCCTAGG GATACTGGGT 
6721 CCAGCCTTTG GTACCTTCAC TCTCAGGGTA CCCCATCACT GTGTCTTGGA GAGCCTAGGC 
6781 ACCCTGGGTC CAGCCTTCAG TACCTGCGCT CTCAGGACAC CCCACCATTG TCTCTTGCCC
6841 CGTCTCTTCT TCCTCTTCCT CCCTTTCATT GTCTCTTCTC TGTTTCTT1?C TTGACTCTCC 
6901 TTTCCCCTCA CACCCTCACT CTAGTTCTCC CCTTCCCTCT CTGCATCACC CTATTCTCTC 
6961 TGTGGTCCCT CCACTTTCCT TTATCTCTCA TGCTTCTCTC CTCCCTCAAA TACTTGTCAC 

7021 CCACTATACT TCAGGGGCCA GCTCTAGTGA CAAAGCTGTT AATAGCAAGA CTCTCAGATC
7081 TCCAACGGCT CAGAGGAGCC AGACCCACCA AGAACTCTCT CCAGGTCCAA TTTCAGGTTC 
7141 CTTCGAAAGC TTTCAGCAAA TGCTCAGGGA ACATGCCACT AACAAGAAGA TGCAAATTCC
7201 AGTTGAGAGT GGGAAAGGCC CTTGCGTAGG TCCCATCTTC CAGGCCAAGG TCAGAGGGGC 
7261 TCTGTGTAAT CCGGATTGAC AGGGCTCAGA ACAATGTTTT GTTTTTAAGG TTTATTTATT
7321 TTAGGTGTTA GTGTCTTTGC TTGCATGACC TTATGTGCAT CATGTGTGTG CAGGTTCCTG 
7381 ATGACAGTAG AGGAGGGCTT TGAATCCCTG GGGATAGGAA GTTACAGGAA ATTATAAGCT 
7441 GCTTTGTGGG TCTTCTAGCT TTCCCAACAG AAGTGAATGC TCTTCACCAC TGAGCCATCT 
7501 CTCTAGGCCC AAGAGACATT GCTTTATGGA TATAATTGTG TGTGTGTGTC AACATTGAGG 
7561 AAAGGGAAAT AAAAAAAAAA CTTCAGCCGC TAAGGTTGTA CAGTTTCACT AATTGCTACT 
7621 TTTAGTTGTG ATAAAATGGC AGGTGCTTCA ACATTTATAT ATACAAAAAC TTCCCTGCTG 
7681 GTGGTTCAAC TGTGAGAAGT GGGGTAAGTG GGTGAGTTCT CTTTTTCTGT CTCTGTCTCT
7741 GTCTCTCTCC TTCCATTCTT TCTTAAAGGA AATAAACATT GCAGCTGGGT TATAGCTCAT 
7801 CAATATGGAA GTTACAGAAG TGAAAAAAGG CATTGCCTTG GTGGGTGGTG TTACCAGCTG
7861 ATTTTTGGTT GTCCTGCAAG GAGGTCTGGG GACTGGCTGC TCTGTGTCTG TCTGTATGAG
7921 TGAGGGAAGT CTGGGGAGCA GATTCCCTAA CCTTCAGCCT GGCCTGGTTC CTGAGTGAAC 

7981 CCAGCCTCTC TGGTCCTAGT AGCTTTTTCC AT^CAGGAAT CTGAGTGGTG ACAGGGAACA
8041 AGTACCAGCC CATTGCTTAA GTGCXIAGGGT TAGTGAGGGC AGGAAGCTGC CATAGCTGGG 
8101 ATTAGTAGTT GTATTGGATG TAGGAAGTCC TATCCTGGGA CACCTAATOG TTAATGCTTC 
8161 ACTGGAGATT TTCAATGAGA AATTTATCCC ACGGCCCATA TGGCGCCATC CTTTTGTCTC
8221 CAACAGCCAA GTATTTTCCA TTAGAGGAGA CTTCCTGTAC ACTTGATGGA TGCTCATTCC
8281 AAGGTGACTT GGGGCAGTCA GTACAGACTT GGGATGACCT CTGACAGGCT AACCTCTCCC
8341 CAACAAGGGC CCTCTATGTT TGCTATGTAA TGTAATGTCA GACATTGTCA GGAGTGTCCG
8401 CAGCACAGCC TGCCCAGTGT GAGGGCTCTC ATAGGTTTCC CACTGTCTTA TCTACACAGG
8461 GATAACGAGG AGGTAAGCTG CAGTTCCCAG TCTCACTTCA CAGAGGAAGA GATAAGCCCA
8521 TCCCAGGTCA TGTAGCCAGC AGTGGAAAGA ATGAGGATTT GAACTCAGGT CTTCCAAGTC
8581 CCATTGATAG CATCTCCTCA CAAGTCCCTT GCCACCCTCA CGATGCCTTA GACACTTGCC
8641 TGCCCTTTAT ACTAAGGAGA TGCAGGTACA AGGGGTTTAC CCATGTAGCA GCTGAGGCAG
8701 CTGGGGATAG ATACCAGCAG CAGGCCTGAT GTCACCACTC TAACTCCAGC ATCCCCAGTC
8761 TGTGTTCCTG GAGTGTGAAA ATCCCTACTT AACAAGATTG TGCAACAGTC CTTGGCTCTG
8821 TGACCCATAG CTGGAAACAG GATTCTCATT GATTTGTGGA ACATGGTGGC AGCCAGCCAA
8881 AAAGAGGGTC TGCATACAGA AGACACGTGT GGCAAGGCCA CAGCAGACTC TGACTACCTT
8941 AGCTTACAGA ATTACAAGGT CATAATGTCC TCTGCTTTGG TCACCTCATG TTAAGGACAG
9001 GCCCTAATGA AGATGGGGCA GAAGACTGAA GGAATGGCCA ACCAATAACT GGCCCAACTT
9061 GAGACCCATC CTACAGGCAA GCATCAATTC CTGACACTAC TAATGATACT CTGTTATGCT 
9121 TGCAGACAGA AGCCTAGCAT AACTATCCTC CGAGAGGTCC ACCCAGCAAC TGAGTGAAAC 
9181 AGAAAAAGAT ATCCACAGGC AAACAGTGGA TGGAGGTCAG GGACTATTAT GGGAOAGCTG 
9241 TGGGAAGGAT TAAAAACCCT GAAGGGGATA GGAACCCCAC AGGAAGACCA ACAGAGTCAA 
9301 CTAAGAGACC TGTGGGAGCT CTCAGAGACT GAGCCACCAA GCAAAGAGCA TACACAGGCC 

9361 GGTCCGAGGC ACCTGGCACG TGTGAAGCAG ACATGCAGCT CAGTCTCCAT GTAGGTCCTC
9421 CAATAAGCGG TAGCCTGACT GCAGTATCCA ATCCCCAACA GGGCTGCATA GTCTGGCCTC
9481 AGTGGGGGAG GATGCCCCTA ATCCTGCAGA GACTTGATGA GTGGAGAGCT ATCCAGGGGiS
9541 AACCCACCCT CTCTGAGAAG GG;^TGGGGA TGGGGGAGGG ACTCTGTGAA GAGGGGACAA 
9601 GGACAAACAA GAACCTCAAA TAGGTCAGGC CCTAAAGGCT TGCTAAGTAG CAGTGGCCCA 

9661 GCTCTGTCCT GTTCCTCAGC CCAAGGCTCA GCTCCCACCT GTTTCTGTGT TTTTCTGGCT
9721 TTTCATGGGC CTAGGACTTG GTGACCAGTT CAAACAATGG GGCCTGTGGA AGACACAATA 

9781 TACAAGACTA GGGACATTCC TGTTCTGCTG ACTATCCATA GCCTGATGTA GGTGGAAGGA
9841 CCCAATCACT GGATTTCTAC CCTTGCACAA CCTTGACAGC TGAGGGCCTC TCAGAAACCT 
9901 ATTTCTTCCA CTGAAAAATG AGACTCTCAA ATGAACGTCG TGAGAATCAT CAGGCTTATT
9961 AAAGAGGTGT ATCTAACCTG AATGGCAAGC AGACAGCAGG CAAATGTCTG TATCAACCTC 

10021 TAGGAAGGAC AAGT^CTGCT CACTGCTGCC CCCCAGGAGG CCATTTGCTG AAACAGCTGC 
10081 TCTCCTGCTG GTGCACAGGC CCTGCCTTCT CATTGCAGCC ACAGCCCCTT CCTGTCTGAA 
10141 CCTCCTGTCA GGTCACTGGG AAACAGATCA AGATGGAACA GGACAGCTCC TGATGGTAAA
10201 TAAAAAACAG TGGTCATGGC TATTCATAGG GGTTTATGCT TCTTCAGTCC ACACTGTGAA 
10261 GAGCTGTGGG CATGAACCAC AGTGTTCGAG GTAGAGTTGG GGTTCTGAAA TTCACAGTGG 
10321 GGTGAGCTCA GTAAATGTGA GCTGGAGGTC ACTCGTGAGA CACACAGTCC TGCTGCTTCT
10381 GTTCCCAATA TCCTGAGGAG ACGACACATC TACTTTGTTC AGAGGCCACA GTCTAGTTGA 
10441 CCTGAGAGTT ACCAGTTTCT TATTTGTGTG TGTGTGTGTG TGTGTGTGTG TGTGTGTGTG 
10501 TGTTGTTCGT GTGTGAGTGC AGGTGCACAT ATGATAGCGT ACACGTTGAG GTCAGAGGAT 
10561 AACTATCAGG CGTTGTCCCC TCCTACTTTT CCTCGGACTC TGGAGAACAA ACATGGGTCC 
10621 TTATTCCAGG GGAGCAAGTC GCTGTTGGCT GACACATCTT GCTCACATAC ATTTTACCTA 
10681 GACAATGGAG CCTCCATCAG AGTATTACTT TAGCTCCTCA CCGATGGCAA TGCAGCACCT 
10741 CTCTACCCAC ATAGGAGTTG GGTCTCCACA CACCCCCACA CCCCCTTCAC CAAAACGTTT 
10801 TCAGTTACTT TATCTGGTAA AGTTCATCAG AGAATGAAGC CAGTATTAAG AACATGGAAT
10861 CATTTGGGAA CCTGGATCTA GCAATACCCC ACCCTAGATG GAGTTGCTGA GTTTTCACCT 
10921 CAGATTATAA TTCCCCCCTA GCTTCTATGG TTTATTCTGA AAGCAGGGGA ACTCGATTCC 
10981 TCCCTTTGGA CCACAGACAT CCTGGCTTGT GAATTCACAT GTCATCTACT GCTAATCCAT 
11041 TGGTAGTATG TGGCTCACAG AGACACACTA CAGTCATGGC CAATGTCAAG GTAGGACAGA 
11101 TGTGAATCAT TCCCCCAGTC CTGCTGTTTT CATGACTAAC CCTCCTCAGC ACAGTGACCA 
11161 TGAACCTACT TTTCCCCTCC TTTTATTTTT AGAATTGCTG GAMTTTCTA TTTTGAGAAA 
11221 TAATAGCCTT GGGCAGGATT AAAC7VAAATC ATCTAGAAAG CTGGTTTAAA ATACAGATGG 
11281 TTGAGTCAGT GAAAGAGTGA GGAATGTCAT TATTGGCCCC TCACAGAGGC TGGCTCACTC
11341 CAGCAGAGGT GGTTGAAGCT CTTGGACACG GGTCAGGTGC ATAGGAAAGG TNGTCTGGGA
11401 CACTGAGAAC CACAATTGAA CAAACAGAAC TGTTGGCTTT TTTTTTTTTA AATGAGTTCT
11461 CAAAAAATGA CTGGCTAGCT TAGGCAAATA CTTCGAGCCA ACCCAACAGA ACATTCTTCC
11521 ATTGATTCAT TCTGGATCTT CTTTCTAGAC AATACTGAAC TGACGCCTTG TTGGCAGTCT 
11581 CAAGTTTGAC AACATAGGGC TTTGAACTTG GCACAAGGTC CATCACTGTC ACCCAAGCAT 
11641 CCTGGGTGAC CTTTGGGTTG GAATATCTTG GCTAACCTTA GATATTTTCT TTGGAGTATC 
11701 TTTAGAACAT CCAGGAAATA GGGCTTGATT CTCATCCTGG GACCAGAATA TAAGTCACCC
11761 TAGAATCCCA GGAGATCGTG CAGAGAAACA AGGATCTCTC TCGTGTGC^fe^ CCTTCTTCAA
11821 AGCAGTGAGT AGTGACTCCA CTAAACTGAG TTCCCATCTG AGAGTCCACA GGAGGCTTTG
11881 GGGCAAGAAG CAGAGGGAAG GCACTGTTTG TGTTGGTAAA GTTTTGACTC TAACAAATTT
11941 GAAGACATAG ATGACATTGT GTCAGACTAA CAACAACCTA GACTCATGTG GGTTCTGTTT
12001 AGGGATCAGA TTTTATTCAT CAATGACTTG TCTTAGTGTA TAGAGAAAGG CTTCCTACOXj
12061 GAGTGTAGGC TCAATAATGA CAGAAGAGAT AGCTATTTCC CCTAGGGACT GTGCTGCTCC
12121 AAGTTTGGTG GAGAAAGGCA GTGGGGAACC TAGATGTGCT CTCTGGGGAG GGGGTCTGAA
12181 GCTGGCTTCA TAGAAGGTGT GAAGTTTTGC TGAAACATCT AAACAGAATT ATAGCTTAGG
12241 AAAGTGAGCA GGCAAGGCAG GGAATGTGTT GCATATGTAT ATGTACATGA ATATATTATG
12301 TTATAGATAC ACACACATTT GAACCTCATT TGCAGATGAC AGAAAATAGG TTATTTTGGC
12361 TCTCTTAACT GCTAAGCACA ATGACTTCCA GTTCCATCCA TTTCCTGAAA TGCCACAATT
12421 TCATTTTTCA TTGTGGCTGA ATAAAATTCC ATTGCAGACT GGGCCCTACT TCATCCACTC
12481 CTGAGGGCAG GCATATCCCC TGGCTCCATT TCTTACCTAT TGTGAAGAGA AGTGCAACTG
12541 TCTTGTTGAA AGGCAAGCGT GAGAGAGGCA GGCACTAATT GTGGGTTTTT GTTTCITCTT
12601 CCTGCTATGA CTCTCCATTT GTCAGAACCA AAGATOGATA AAAGCCGCCA CCATGAAAGC
12661 CATCTTAATC CCATTTTTAT CTCTTCTGAT TCCGTTAACC CCGCAATCTG CATTCGCTCA
12721 GAGTGAGCCG GAGCTGAAGC TGGAAAGTGT GGTGATTGTC AGTCGTCATG GTGTGCGTGC
12781 TCCAACCAAG GCCACGCAAC TGATGCAGGA TGTCACCCCA GACGCATGGC CAACCTGGCC
12841 GGTAAAACTG GGTTGGCTGA CACCGCGCGG TGGTGAGCTA ATCGCCTATC TCGGACATTA
12901 CCAACGCCAG CGTCTGGTAG CCGACGGATT GCTGGCGAAA AAOGGCTGCC OGCAGTCTGG
12961 TCAGGTCGCG ATTATTGCTG ATGTCGACGA GCGTACCCGT AAAACAGGCG AAGCCTTCGC
13021 CGCCGGGCTG GCACCTGACT GTGCAATAAC CGTACATACC CAGGCAGATA CGTCCAGTCC
13081 CGATCCGTTA TTTAATCCTC TAAAAACTGG CGTTTGCCAA CTGGATAACG CXSAACGTGAC
13141 TGACGCGATC CTCAGCAGGG CAGGAGGGTC AATTGCTGAC TTTACCGGGC ATCGGCAAAC
13201 GGCGTTTCGC GAACTGGAAC GGGTGCTTAA TTTTCGGCAA TCAAACTTGT GCCTTAAAGG
13261 TGAGAAACAG GACGAAAGCT GTTCATTAAC GCAGGCATTA CCATGGGAAC TCAAGGTGAG
13321 CGCCGACAAT GTCTCATTAA CCGGTGCGGT AAGCCTCGCA TCAATGCTGA CGGAGATATT
13381 TCTCCTGCAA CAAGCACAGG GAATGCCGGA GCCGGGGTGG GGAAGGATCA CCGATTCACA
13441 CCAGTGGAAC ACCTTGCTAA GTTTGCATAA CGCGCAATTT TATTTGCTAC AACGCACGCC
13501 AGAGGTTGCC CGCAGCCGCG CCACCCCGTT ATTAGATTTG ATCAAGACAG CGTTGACGCC
13561 CCATCCACCG CAAAAACAGG CGTATGGTGT GACATTACCC ACTTCAGTGC TGTTTATCGC
13621 CGGACACGAT ACTAATCTGG CAAATCTCGG CGGCGCACTG GAGCTCAACT GGACGCTTCC
13681 CGGTCAGCCG GATAACACGC CGCCAGGTGG TGAACTGGTG TTTGAACGCT GGCGTCGGCT
13741 AAGCGATAAC AGCCAGTGGA TTCAGGTTTC GCTGGTCTTC CAGACTTTAC AGCAGATGCG
13801 TGATAAAACG CCGCTGTCAT TAAATACGCC GCCCGGAGAG GTGAAACTGA CCCTGGCAGG
13861 ATGTGAAGAG CGAAATGCGC AGGGCATGTG TTCGTTGGCA GGTTTTAGGC AAATCGTGAA
13921 TGAAGCACGC ATACCCGCrT GCAGTTTGTA AGGTACCCGG GGATCACT^C TTGCCCTCTG
13981 AAGAGGAAGA ACAGAAGGAT GCCACAACTC TCCTGCTQGC TACTCTGCAG TGGTTTCATC
14041 TTACTTCTGA TGGCATTTCC CTCTAGAAAG TGCTACTATC ATCCACACAT TTCTACGTGA
14101 GACCACCCAA AGGACCCTCC CAAATTCTCT TCCTCTCTGA GTAGTCTCCA CACCTGTTAC
14161 CACCATCCCA GAATTAAAAT CCTAACTGCA CTCTGGCGTG TGACTTGCCT CAGTCCTTGC
14221 AATAAGAGTT GTTGGCAGTG CCAGGCGTGG TGGCGCACGC CTTTAATTCC AGCACTTCGG
14281 AGGCAGAGGC AGGCGGATTT CTGAGTTCGA GGCCAGCCTG GTCTACAGAG TGAGTTCCJ^
14341 GACAGCCAGG GCTATACAGA GAAACCCTGT GTCGAAAAAC CAAAAAAAAA AAAAAAAGTT
14401 GTTGGCAGAG TGTGGGTTAT ATACCAGGTG GAGATTTCAA ATGAGTGGCT GAAGCTCTAG
14461 CCAGAAGGAA CTTAGAGGAT AGCTCATAAC TTAAAAAGAA ATGTAGAGAG TAGCAGAAAC
14521 ATTGAGAGAG TGGGCACACA GCCACTGTGT GAATGTGGCA GAACACAATC CAGCCAOCTA
14581 TACATGCATA AGTGTATATT GGCGCCATCC TGACTGATGA GACACAGGAA AACAGATAGA
14641 CGGGGTTAGG TGGCCATGGC CTTTCCTGCC TGCCTCTTCC TAAGQGTCAT CTCAAGACCT
14701 TATGCTCTCT TAACTCTTCC ATTGCTACTT AGCTTCTAGA TATCACCTCC AGATTAGTCT
14761 CCTTGGGTAC ATCAGTGATC CTGGTGATAT CCAGGGCTTC CTGATTCCAT CTTTGTCATA
14821 GAGGCTGCAA CTAAAGAGGT CTTCTTAATA CTTCACACCG TGATGCCAAA AGGAAGACAC
14881 AGAAGTTCAC AGAGGTGAAG TGATTCATGT AGGACATACA GTGAGCAAGC ATGAGGGTCC
14941 GGATTATCTG ACTCTACTCT AACTTTTATG TAAATGTGCT TTATGCCATT AACACTGTCA
15001 TTCCTGTGCT TCAGCTCTGG GAGACTCCCA AGCACTCTTA GGCACAAGCC ACAATTAAGG
15061 GACTCTGACA CTCTGCATTG ATTAATTAGC ATGGTCGTGT CTATGTTTCC AGATTCATCA
15121 TTGTTTCACT TTGCATATAG GCTATGAAGG GTGTGAGGAA ATTTTTTGGG GACAGAATTG
15181 GAGGCAATCC ACCTCTCTCA GGAAOCCTCT ATCTGGAAAA GCTTACAACT CAGGGACAGT
15241 AACTGTAGGC CCAGTCCTTG GTGTCCAAAA TGGGTTTTAT GGTTTGAATC-^GCAAAGCCT
15301 TCCATGTGCT CAAAGGTTTG AACATGGAGC CTCCTCCTGG TAACACTGTA TTGGAGGCTT
15361 TTGAGACTGG ATGCTCTTTG GTCCCATGTT TTGCTACATC ATCTGTCAAG ATATGACCCA
15421 GGCATGCTAC CAGCTACCAC AGACTATGCC TCTCCAGCTT TCATGTTCTC CCCACCATGA 

15481 TAGACTTGTA TCTCCTAAAA ATGGAATCAA AGCAAACTTT TCCTGCATTA AGTTTTTTTT
15541 TTTCTGTTAA GTGTTTGGTC ACAGGGACAA GAAAACACTC AATACAGATA ATTAGTACCA 
15601 GAGTTGAGGT TCATTGCTCT AGCAAGTTGG ATCAAATTTT TAGGGCTTTG GAACTGATTT
15661 ATAAGAGACA TGTAGAAGAG TCTGAAGCTG TGGGCTACAG AAGTGTCACC AGTTTTTAAG
15721 AATAGTTTAA TACACCATGG GAATTGTGAA AATCAGAATG CTCACACAAA GGCAGACAGG
15781 AAAACGTGAG CATGTGGCGT GTGAGAGGGC ATAAGAAGGA ACCTAGGGGG AAATGAGCTA
15841 GAAGCCATTC GGCTACGTTA GGGAACGTGT GTGGCTGTGC TTGGCCCATG CCCTGGCAAT
15901 CTGAATGAGG CCAAATTTTA AAGGAGTGGA CTAACTCGAT TGTCAGAGAA AATATCAAGA
15961 CAGACCACCA CTCAGGCTAT GCCGTGTTTG TGACCGACCA GCTACTCTTA GCCAGCTCTA
16021 TTGTGAAATT CCAGAGCAAT TATCAGAGCA TGAAGATACA TACAGTTTAG TGAAGXAAGG 

16081 GGTGTGGGTC CCTAAGTGGA TGGTGCATAA ATCTATGTAG GTGATGGCTA AGTGACACTT
16141 GATAATCCAA AATATCAGCA ATGTGGAATG TCTTCCAAGG AGACCTGTAG ACACACATTT 
16201 TAGAACTTTG CTCATGGCTG TAATAAATAG CTAGCTAGAA ATCATTTCCT GAAGAGGTTA 
16261 GTCTGAGTTA CGGTTCCAGG GCAAACATTC AGTGATGGCA AGGAAGGCAT TGCAGTCAGG 
16321 AGCCAAAGGT CAGCTGGTCA CATTGCATCA AGAGTAGAGA GTCA<3AGTGT GAGTAGAAAG 
16381 AGGATACAGG TTATAAAACC TCACTGTCCA CTCTCAGCAA TCCATTTTCT CCTAA7UVGGC
16441 TTTACCTTCT AAAGATTTTA GTCTTCAAAA CCAGTACCAG TAGCCTGGGA ACAAAAGTTG 
16501 AAACAAATGA GCCTTTGTGG GGCATTTCAC ACTTAAAACA GGGCATCACC TAGGAGGAGC 
16561 CCTGTGTGCA GTAGGAAGTG TGGCCTCTGT GTCAGGAATG CTCAGGCTAA TAAGGGGTCC 
16621 TCTATCTGAG GGACCCTATG AAGATXCAAC AAGTAGTTGT GAGAATTCCG TGTAAATGGA 
16681 TGCTACCAAT TTGACATTTG TAGACCTGCT ATTGTGTGCT TCTTTATTGG GCTCTCCCAT
16741 CTCCCAACTT TCCAACCCAT ATTCCACATT AATCCCTTCC ACCAOCATGC AACACTAGGT 

16801 AGGAGAGAAG GAAGGTTAGA AGAGAAAGTG GGTATAGATC TATTTAGACT ACTTCCTGCT
16861 GATTAGGGGC AAGTCCAATC GTCATTGTCA GGATACCTCC AACCAGCAAC CAGCAAACCA
16921 GCAAATCAGA AACAGCAAAA GCAGCCAACA AGGCAGCACT AACCAGCA<3G ATTGGGGTCG 
16981 GTAGCGTGGG AGCAGTCACT ACTGGTCTTC TCATGGCTTT GGCATTAATA CTCTCTCAAG 

17041 AAATTCCGTA ATTTTTTCCC CACCACCTGA AATTCCGTAA TTTTAAATGC AAACTATCTA
17101 CAGCTGGCAA AAATCACATC TCTCCTAGAG CACAAGACAA ATCATAGTTA CTGGCTATTT 
17161 GCAATCTGAA GCATCTCAAT ATCCCACAGC TGGGATTAAA ACAAAAACAT ATTCACATCA 
17221 CATAACTGTT TTTTTTTTCC AATTTTTTAT TAGGTATTTT CTTTATTTAC ATTTCAAATG
17281 CTATCCCGAA AGTCCCCTAT ACCCTCCCAC CTCCCTGCTC CCCTACACAC CCACTCCCAC
17341 TTTTTGACCC TGGAGTTCCC CGGTACTGGG OCATATAAAG TTTGCAAGAC CAAGGGGCCT 
17401 CTCTTCCCAG TGATGGCCGA CTAAGCCATC TTCTGCTACA TATGCAGATA GAGACACGAG
17461 CTCTGGGGGT ACTAGTTAGT TCATATTGTT GTTCCACCTA TAGGGTCGCA <5ACCCCTTCA
17521 GCTCCTTGGG TACTTTGTCT AGCTCCTCCA CTGGGGGCTC TGTGTTTTAT CTAATAGATG 

17581 ACTGTGAGCA TCCACTTCTG TATTTGACAG GCACTGGCCT AGCGTCACAT GAGCCACCTA
17641 TATCAGGGTC CTTTCAGCAA AACCTTGCTG GCATGTGCAA TAGTGTCTGC GTTTGGTGGT 
17701 TGATTATGGG ATGGATCCAC TAGTTCTAGA GC